A novel algorithm for rectangular floorplanning with guaranteed 100% area utilization is used to construct new sets of floorplanning benchmarks. By minimizing the maximum block aspect ratio subject to a zero-dead-space constraint, example zero-dead-space (ZDS) floorplans matching the area profiles of any existing floorplanning benchmark circuits can be constructed. A mathematical analysis shows that the aspect ratios of the ZDS benchmarks' blocks are uniformly bounded within [1, 3] in most cases. Block packings produced by the Parquet, B*-tree, TCG-S, and BloBB packages on these new benchmarks are compared to the optimal-area floorplans produced by the ZDS algorithm.
INTRODUCTION
How much room for improvement is there in existing physical design algorithms? The answers for floorplanning, partitioning, placement, and routing algorithms will help researchers determine which phases of the physical design process need the most attention. Previous studies on the optimality of placement algorithms ([5], [7]) have shown a significant gap (up to 2× in some cases) between the performance of existing algorithms and the optimal solutions. Similar studies for partitioning algorithms [7] showed a much smaller gap. In this paper, a new technique for benchmarking the area optimality of existing floorplanning algorithms is introduced, analyzed, and applied to a few available leading floorplanning packages.
* Financial support from Semiconductor Research Consortium Contract 2003-TJ-1019 is gratefully acknowledged.
Floorplanning is a critical phase in the physical design of VLSI circuits when a hierarchical design methodology is used. The floorplanning problem is to shape and place N rectangular modules of given areas in the plane. The floorplanning objective is typically to minimize either (a) the area of the smallest rectangle circumscribing the blocks, (b) the total wirelength of the circuit, (c) its timing performance, or, most likely, (d) some combination of these. With these objectives, constraints on the aspect-ratios of the blocks must be imposed in order to ensure physically useful solutions. If the blocks must have a fixed aspect ratio they are called hard, otherwise they are called soft and are allowed to be shaped.
The importance of good floorplanning algorithms has increased significantly in the last few years, as design hierarchy and IP blocks are used ever more intensively to reduce design complexity. However, their evaluation has been increasingly difficult, for two main reasons.
(i) The effects of a good floorplanning algorithm become evident only in later stages of the physical design process.
(ii) The available benchmarks are old and small in size.
While the performance of most algorithms is excellent on these small benchmarks, there is a valid concern about the performance of the algorithms on contemporary, more complex designs with many blocks (say, 100-1000). Such instances may arise, for example, when certain levels of the logic hierarchy are flattened [16].
Many floorplanning algorithms have been developed in recent years, varying mostly in the representation of the geometric relationships among the modules. They can be divided into two major categories: slicing and non-slicing algorithms. The first slicing algorithms were developed in the 1980s ([17], [21]). In the 1990s, non-slicing algorithms became more popular, especially after the introduction of the Sequence Pair [14] representation. Other non-slicing representations include TCG [11], TCG-S [12], B*-tree [6], CBL [10], O-tree [9], and BSG [15].
In this paper, a simple slicing algorithm that guarantees a ZDS floorplan while attempting to minimize the maximum block aspect ratio is defined and analyzed. The analysis shows that the block aspect ratios obtained are uniformly bounded in terms of the area variation of the given blocks. Existing methods typically attempt to maximize area utilization subject to aspect-ratio bounds. By recasting the block aspect ratios computed by the ZDS algorithm as constraints, we interpret the ZDS floorplans it produces as new optimal-area benchmarks for existing algorithms. Thus, new floorplanning benchmarks with known ZDS solutions and realistic, uniform bounds on all blocks' aspect ratios are generated. Experiments then compare the results of a few state-of-the-art floorplanning packages on the new benchmarks. Early results also suggest that further refinements of the ZDS algorithm may render it useful in and of itself as a component in floorplanning and placement algorithms. Previous works on this subject ( [23] , [19] ) analyzed the theoretical upper bounds on the total area achieved by slicing floorplans of soft blocks.
The main limitation of this work to date is that it does not consider wirelength. Wirelength optimality in combination with area optimality will be addressed in future work.
The remainder of this paper is organized as follows. Section 2 presents a ZDS algorithm. Section 3 outlines the proof that, under very mild regularity assumptions, this algorithm creates benchmarks with small, uniform bounds over all blocks' aspect ratios. Section 4 compares experimental results of some leading floorplanning algorithms on our area-optimal benchmarks. Section 5 describes a few possible uses of the ZDS framework beyond benchmarking. The paper is concluded in Section 6.
BENCHMARK CONSTRUCTION
The input to the ZDS generator is a set of block areas with no aspect ratio constraints at all; these may be extracted from existing benchmarks. The output of the ZDS generator is a floorplan, i.e., a set of block shapes and locations, with realistic uniform bounds on the aspect ratios of all modules. Compared to the traditional floorplanning formulation, the objective and constraints for the ZDS algorithm are swapped. Rather than minimize unused area subject to block-aspect-ratio constraints, the benchmark generator implicitly seeks to minimize block aspect ratios while maintaining full area utilization at every step. To get a benchmark, we simply interpret the ZDS block-shape output as input constraints for algorithms in the traditional area-optimizing vein. If the benchmark block shapes are held fixed as equality constraints, then the ZDS floorplan is a hard-block packing benchmark. If the benchmark block shapes are allowed to vary within some interval containing the ZDS block shapes, then the ZDS floorplan is a soft-block packing benchmark.
The ZDS algorithm used here is based on recursive top-down area bipartitioning. At each step, the blocks in a region are separated into two groups such that the groups' total areas are as nearly equal as possible. The region is then cut parallel to its shorter side such that each group fits exactly into one of the resulting subregions. Cutting parallel to the shorter side keeps aspect ratios of subregions bounded in terms of the area variation among the blocks. Blocks are placed once they fill a sufficient fraction of their subregions; this fraction is expressed as the reciprocal of the parameter γ introduced below. Figure 1 shows the pseudocode for this ZDS algorithm (Algorithm 2.1).
[Figure 1: Pseudocode for Algorithm 2.1. Final step: recur on the subregions.]
1 By this definition (which we will use for the remainder of the paper) the aspect ratio of any block is always at least 1.
Although Algorithm 2.1 can accept as input any values ρ(R) ≥ 1 and γ ≥ 1, the analysis in Section 3 shows that the generated floorplans display attractive properties for certain choices of ρ(R) and γ. More specifically, if we define
then appropriate values for γ and ρ(R) are
A trace of the algorithm on a simple 5-block example is illustrated in Figure 2 . 
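The recursive bipartitioning described above can be illustrated with a minimal Python sketch. It is not the authors' Algorithm 2.1: the greedy balancing rule, the names `Rect`, `split_balanced`, and `zds_sketch`, and the choice to recurse all the way down to single blocks (ignoring the γ placement threshold and the parameter ρ(R)) are simplifying assumptions made only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Rect:
    x: float
    y: float
    w: float
    h: float

    @property
    def aspect(self) -> float:
        # Aspect ratio taken as long side / short side, so it is always >= 1.
        return max(self.w, self.h) / min(self.w, self.h)


def split_balanced(areas):
    """Greedy two-way split (an assumption): largest block first,
    each block goes to the group with the smaller running area."""
    left, right = [], []
    sum_left = sum_right = 0.0
    for name, area in sorted(areas, key=lambda item: -item[1]):
        if sum_left <= sum_right:
            left.append((name, area))
            sum_left += area
        else:
            right.append((name, area))
            sum_right += area
    return left, right, sum_left, sum_right


def zds_sketch(areas, region):
    """Return {name: Rect} tiling `region` exactly.

    `areas` is a list of (name, positive area) pairs whose areas sum
    to the area of `region`.
    """
    if len(areas) == 1:
        return {areas[0][0]: region}
    left, right, a_left, a_right = split_balanced(areas)
    frac = a_left / (a_left + a_right)
    if region.w >= region.h:
        # Cut line parallel to the shorter (vertical) side: divide the width.
        w1 = region.w * frac
        r1 = Rect(region.x, region.y, w1, region.h)
        r2 = Rect(region.x + w1, region.y, region.w - w1, region.h)
    else:
        # Cut line parallel to the shorter (horizontal) side: divide the height.
        h1 = region.h * frac
        r1 = Rect(region.x, region.y, region.w, h1)
        r2 = Rect(region.x, region.y + h1, region.w, region.h - h1)
    placed = zds_sketch(left, r1)
    placed.update(zds_sketch(right, r2))
    return placed
```

For example, calling `zds_sketch` on five blocks with areas 4, 3, 2, 2, 1 and a 4 × 3 region yields a zero-dead-space tiling in which every block receives exactly its area; under this sketch the largest block aspect ratio is about 2.25.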
ANALYSIS
The utility of Algorithm 2.1 rests on the fact that for nearly all realistic circuits, all the block aspect ratios it computes are guaranteed to lie within a single small interval of the form [1, γ + 1], when γ is defined as in (2). Hence, if the blocks are arranged in nonincreasing sorted order by area, the aspect ratios are bounded by one plus the maximum ratio of consecutive block areas, when this latter ratio exceeds 2. Otherwise, the aspect ratios are bounded above by 3. These facts are established here, under Assumptions 3.1 below. Assumption 3.1(a) can be rephrased as follows: the threshold fraction of subregion area that a block must occupy in order to be shaped and locked in place is not set above 1/2. Although these assumptions are stronger than necessary to achieve zero dead space and acceptably bounded block aspect ratios, they are not very restrictive on the sets of block areas that may be considered. Further discussion of the assumptions appears at the end of this section.
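As a small illustration of the bound just described, the following Python sketch computes γ and the resulting aspect-ratio bound γ + 1 from a list of block areas. It assumes the simple definition γ = max{2, max_i a_i/a_{i+1}} over areas sorted in nonincreasing order, as quoted in the Remarks; the function name `gamma_and_bound` is hypothetical.

```python
def gamma_and_bound(areas):
    """Return (gamma, gamma + 1) for a list of positive block areas.

    gamma = max(2, largest ratio of consecutive areas in nonincreasing order).
    """
    a = sorted(areas, reverse=True)
    max_ratio = max((a[i] / a[i + 1] for i in range(len(a) - 1)), default=1.0)
    gamma = max(2.0, max_ratio)
    return gamma, gamma + 1.0
```

When no consecutive ratio exceeds 2, the function returns the bound 3; a single large outlier (for example areas 10, 2, 1) raises γ to 5 and the guaranteed bound to 6.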
Derivation of the Aspect-Ratio Bound
Throughout this section, we consider the properties of Algorithm 2.1 under Assumptions 3.1. Due to page limits, only a sketch of the mathematical derivation of the aspect-ratio bound can be given here. Detailed proofs are available online in a technical report [20] .
The first lemma shows that, in order to bound the aspect ratios of the blocks, it suffices to bound the aspect ratios of the regions in which they are placed. The next lemma bounds the aspect ratio of sibling subregions in terms of their area ratio and the aspect ratio of their common parent subregion.
Lemma 3.2. Suppose subregion R is partitioned into subregions R1 and R2 with areas A1 and A2. Let
Theorem 3.1. In Algorithm 2.1,
From Lemma 3.2 and Theorem 3.1, we immediately obtain the following bound.
Corollary 3.1. Suppose subregion R is partitioned into subregions R1 and R2. Then 
Remarks
The assumption γ ≥ 2 presents no practical restriction on the sets of blocks that may be considered. It just means that the upper bound on block aspect ratios guaranteed by the analysis here for the given algorithm is at least 3. That is, consecutive-pairwise area bounds tighter than 2 (e.g., ai/ai+1 ≤ 1.5) are not guaranteed to reduce the maximum aspect ratio below what can be attained with ai/ai+1 ≤ 2.
Similarly, a large value of γ does not necessarily indicate any large aspect ratios in the final floorplan, as Figure 5 illustrates. In the figure, one large block occupies one subregion, and several small blocks occupy another subregion. Although the area ratio of the subregions may be arbitrarily large, the presence of sufficiently many small blocks used to fill the small subregion prevents any single block's aspect ratio from becoming large. For some designs, the presence of a few very large or very small blocks may result in a large value of γ, if γ is defined simply as max{2, maxi ai/ai+1}. However, a few simple preprocessing steps can usually be used to reduce this value significantly. The basic idea is simply to aggregate smaller, similarly sized blocks together until the aggregates are more comparable to larger blocks or sets of blocks. 
If max i≠m ai/ai+1 … Am/Ām, a few recursive iterations on Rm and R̄m can be used to reduce the bound further. Although it is trivial to construct examples where this preprocessing will be useless (e.g., when N = 2, or when m = N − 1), on practical examples with large N, the reduction in γ will likely be considerable when m is sufficiently less than N.
AREA-OPTIMALITY STUDY
Algorithm 2.1 was implemented in C++/STL and applied to the MCNC benchmarks [13] and the largest of the GSRC [8] benchmarks without regard to connectivity and with aspect ratios unconstrained. The results were then interpreted as benchmarks that match the block-area characteristics of real circuits but have known optimal-area solutions. The new benchmarks differ from the original benchmarks in that they specify either (i) the exact aspect ratio of each block or (ii) a range of aspect ratios for each block that includes the ratio for the block computed by Algorithm 2.1. For most circuits, these aspect-ratio values and ranges are only slightly larger (or even smaller) than those in the original benchmarks. For a scalability analysis, we also used as prototypes versions of the largest benchmarks (ami49 and n300) with 10 times the number of blocks of the original circuits (ami49-10× and n300-10×, respectively). We grouped the new benchmarks in a suite named FEKO-A (Floorplanning Examples with Known Optimal Area). The benchmarks and their optimal solutions can be found online [22].
Table 1 shows the characteristics of the circuits of the FEKO-A suite: the name of the original circuit, the number of blocks, the value of γ, and the maximum aspect ratio of a block. In all the experiments, γ is set to the value from (2); hence, we interpret γ as both a circuit property and a ZDS-algorithm parameter. As expected, the aspect ratios of the blocks are bounded above by γ + 1. With the exception of the apte circuit, the values of γ guarantee a relatively low maximum aspect ratio. Note that on the xerox example, all the blocks have aspect ratio less than 2. This means that the optimal block placement of the benchmark is a valid solution of the soft-block version of the original floorplanning benchmark, where the blocks can be reshaped to any aspect ratio up to 2.
The same is true for the n300 example, where the aspect ratio bound is 3 (as for all the GSRC benchmarks with soft blocks; the value of γ for all the GSRC benchmarks is 2, so the algorithm presented in this paper can solve them optimally). Even for the hp example, where the maximum aspect ratio of the blocks is 2.01, a simple postprocessing procedure of stretching the floorplan in one of the coordinate directions can reduce the maximum aspect ratio of the blocks to 1.67. Thus, we have created optimal-area solutions for all these floorplanning benchmarks, which as far as we know have never before been reported in the literature.
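The stretching step mentioned above is not specified in detail, so the following Python sketch shows one plausible reading of it. It assumes the whole floorplan is scaled by a factor t in the x direction, which multiplies every block's width-to-height ratio by t and leaves the tiling free of dead space (all areas scale by the same factor). The function name `best_stretch` is hypothetical.

```python
import math


def best_stretch(blocks):
    """blocks: list of (width, height) pairs with positive entries.

    Returns (t, max_aspect): the x-scaling factor t that minimizes the
    largest block aspect ratio, and that minimized aspect ratio.
    """
    ratios = [w / h for w, h in blocks]
    r_min, r_max = min(ratios), max(ratios)
    # After scaling, block i has aspect max(t * r_i, 1 / (t * r_i)).
    # The worst case is max(t * r_max, 1 / (t * r_min)); the two terms
    # balance when t**2 == 1 / (r_min * r_max).
    t = 1.0 / math.sqrt(r_min * r_max)
    max_aspect = max(max(t * r, 1.0 / (t * r)) for r in ratios)
    return t, max_aspect
```

Under this reading, the best achievable maximum aspect ratio is the square root of r_max / r_min, so stretching helps most when the wide and tall blocks are not equally extreme.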
We used the FEKO-A circuits to evaluate the performance of four state-of-the-art floorplanners whose code is available on the Internet.
• Parquet-2. Parquet [1] is a floorplanning package that uses the Sequence Pair [14] representation. The public release supports block shaping. Parquet-2 can be downloaded online [18].
• B*-tree. This package is based on the B*-tree geometric representation [6] . It can be downloaded online [3] . The public release does not support block shaping.
• TCG-S. TCG-S [12] is an extension of the well-known TCG [11] algorithm. The package can be downloaded online [3] and it does not support block shaping.
• BloBB (Block-packing with Branch-and-Bound) [4]. BloBB is an area-only floorplanner that solves small instances (up to 12 blocks) optimally and uses a hierarchical approach for larger instances. BloBB can be downloaded online [2] and it does not support block shaping.
We tested these four packages on the FEKO-A suite with all block aspect ratios held fixed. Every program was run 10 times and the best results were collected. All the packages were run in area-minimization mode, since most of them (though not BloBB) also support wirelength minimization. Parquet-2, B*-tree and BloBB were run in a Linux RedHat 8.0 environment on a 2.2 GHz Pentium 4 processor (2 GB memory), while TCG-S was run on a SunBlade 1000 machine on Solaris 2.8 with a 750 MHz processor (2 GB memory). Table 2 shows the experimental results for each of the circuits in terms of dead space and runtime. The runtime statistics are also shown graphically in Figure 6. Since each benchmark has a known optimal solution with zero dead space, the dead space (measured as a percentage of the total area of a floorplan) generated by an algorithm shows the area gap between the algorithm's result and the optimal solution.
Table 2: Experimental results for the FEKO-A examples. The dead space is the best of 10 runs; the runtime is for one run.
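The dead-space metric can be made concrete with a short Python sketch. It assumes that the "total area of a floorplan" is the area of the smallest axis-aligned rectangle enclosing all placed blocks and that the blocks do not overlap; the function name `dead_space_percent` is hypothetical and is not taken from any of the packages above.

```python
def dead_space_percent(blocks):
    """blocks: list of (x, y, w, h) for placed, non-overlapping rectangles.

    Returns unused area as a percentage of the bounding-rectangle area.
    """
    x0 = min(x for x, _, _, _ in blocks)
    y0 = min(y for _, y, _, _ in blocks)
    x1 = max(x + w for x, _, w, _ in blocks)
    y1 = max(y + h for _, y, _, h in blocks)
    bbox_area = (x1 - x0) * (y1 - y0)
    used_area = sum(w * h for _, _, w, h in blocks)
    return 100.0 * (bbox_area - used_area) / bbox_area
```

A ZDS floorplan gives 0 by construction, so for these benchmarks the returned value is exactly the gap between a packer's result and the known optimum.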
From the table and the figure, we draw the following conclusions:
• As expected, the BloBB package finds the optimal solution for the three smallest benchmarks. For the other circuits, it finds non-optimal solutions of very good quality (12.06% from the optimal solution in the worst case).
• The other algorithms also perform very well on the benchmarks with up to 500 blocks: in the worst case, B*-tree is 12% from the optimal, Parquet-2 14.4%, and TCG-S 15.3%.
• For the largest benchmark (3000 blocks), Parquet's floorplan has dead space equal to 21.17%, while the dead space of B*-tree was 27.35%. TCG-S did not complete one run after 24 hours.
• BloBB shows very good scalability in terms of runtime. In fact, its runtime on large circuits is sometimes lower than on smaller circuits, since on large circuits it runs in a non-optimal hierarchical mode, while on small circuits it finds the optimal solution.
OTHER APPLICATIONS
While the primary motivation for Algorithm 2.1 was to create benchmarks with known optimal-area solutions and reasonable aspect ratios, it seems likely that extensions of the ZDS idea can have more uses in physical design.
• As shown in Section 4, Algorithm 2.1 can optimally pack 2 of the 5 MCNC benchmarks and all the GSRC benchmarks (soft-block version) within the given aspect ratio bounds. This suggests that a top-down ZDS algorithm can be used as a stand-alone soft-block packing algorithm for area minimization.
• Algorithm 2.1 does not consider wirelength, but there is some similarity between the recursive area-bipartitioning step of Algorithm 2.1 and the recursive process in partitioning-based placement algorithms. A multilevel floorplanner that can combine or alternate between these two steps has the potential of focusing on both objectives (area and wirelength), if the blocks' aspect ratios can be relaxed up to three (or even less than three with a penalty of some dead space). This approach can be applied, for example, at the coarser levels of a multilevel placement algorithm, where cells are grouped into clusters with no predetermined shape.
• A ZDS algorithm can also be used to determine the shapes of blocks in a hierarchical design, where close interaction with designers at the highest levels of design can help achieve very compact block packings.
CONCLUSION
A novel algorithm was used to create floorplanning benchmarks with known optimal-area solutions. The construction of the benchmarks is such that the aspect ratios of the modules are uniformly bounded in terms of the area variation of the given blocks. This fact, in combination with the capability of the program to create benchmarks with any block-area distribution given by the user, makes the benchmarks realistic and suitable for evaluation of floorplanning algorithms. Experiments were conducted with four state-of-the-art floorplanning packages run in block-packing mode with no reshaping. For circuits with fewer than 500 blocks, the algorithms perform very well, but for larger benchmarks they may leave dead space of up to 27%. These examples can potentially help algorithm designers fine-tune their programs and understand some of the limitations of existing floorplanning and block-packing algorithms.
Algorithm 2.1 can be viewed as one solution to an alternative formulation of floorplanning in which the maximum aspect ratio is minimized subject to a strict zero-dead-space constraint. From this view, the performance of Algorithm 2.1 can likely be improved considerably in at least two ways: (i) using one level of aggregation to reduce maximum area variations of sibling partitions, as discussed in Section 3; (ii) using direct search on the set of available aspect ratios of the entire floorplanning region in order to reduce the maximum aspect ratio of any block in the final floorplan. Additionally, we expect that more sophisticated postprocessing analysis of each block's sequence of ancestor subregions may be useful in converting a large aspect ratio of a single isolated block to a set of small perturbations to each of a larger set of blocks.
The FEKO-A benchmarks were created for area optimization only, with no netlist information. A trivial extension for the generation of benchmarks with optimal wirelength can include the creation of zero-wirelength nets with degree up to 4 and pins placed at the intersections of block boundaries. However, this approach is limited in how well it can match the profile of real circuits (for example, it produces no high-degree nets). Future work will explore the generation of realistic examples with known optimal or near-optimal solutions in terms of both area and wirelength.
