4 research outputs found

    Ultra-fast interconnect driven cell cloning for minimizing critical path delay

    No full text
    In a complete physical synthesis flow, optimization transforms, that can improve the timing on critical paths that are already well-optimized by a series of powerful transforms (timing driven placement, buffering and gate sizing) are invaluable. Finding such a transform is quite challenging, to say nothing of efficiency. This work explores innovative cloning (gate duplication) techniques to improve timing-closure in a physical synthesis environment. With a buffer-aware interconnect timing model, new polynomial-time optimal algorithms are proposed for timing-driven cloning, including both finding optimal sink partitions (identifying the fan-outs) for the original and the duplicated gates, as well as physical locations for both gates. In particular, we present an O(m)-time optimal algorithm to minimize the worst slack if the original gate is movable, and an O(m log m) algorithm if the original gate is fixed, where mm is the number of fan-outs. To the best of our knowledge, this work is the first one considering the timing-driven cloning problem under a buffer-aware interconnect delay model. For a hundred testcases in 45nm technology node, we demonstrate significant timing improvement due to our cloning techniques as compared to other existing timing-optimization transforms. Extensions to other factors, such as wirelength, FOM and placement obstacles are further discussed. Copyright 2010 ACM

    Timing-Driven Macro Placement

    Get PDF
    Placement is an important step in the process of finding physical layouts for electronic computer chips. The basic task during placement is to arrange the building blocks of the chip, the circuits, disjointly within a given chip area. Furthermore, such positions should result in short circuit interconnections which can be routed easily and which ensure all signals arrive in time. This dissertation mostly focuses on macros, the largest circuits on a chip. In order to optimize timing characteristics during macro placement, we propose a new optimistic timing model based on geometric distance constraints. This model can be computed and evaluated efficiently in order to predict timing traits accurately in practice. Packing rectangles disjointly remains strongly NP-hard under slack maximization in our timing model. Despite of this we develop an exact, linear time algorithm for special cases. The proposed timing model is incorporated into BonnMacro, the macro placement component of the BonnTools physical design optimization suite developed at the Research Institute for Discrete Mathematics. Using efficient formulations as mixed-integer programs we can legalize macros locally while optimizing timing. This results in the first timing-aware macro placement tool. In addition, we provide multiple enhancements for the partitioning-based standard circuit placement algorithm BonnPlace. We find a model of partitioning as minimum-cost flow problem that is provably as small as possible using which we can avoid running time intensive instances. Moreover we propose the new global placement flow Self-Stabilizing BonnPlace. This approach combines BonnPlace with a force-directed placement framework. It provides the flexibility to optimize the two involved objectives, routability and timing, directly during placement. The performance of our placement tools is confirmed on a large variety of academic benchmarks as well as real-world designs provided by our industrial partner IBM. We reduce running time of partitioning significantly and demonstrate that Self-Stabilizing BonnPlace finds easily routable placements for challenging designs – even when simultaneously optimizing timing objectives. BonnMacro and Self-Stabilizing BonnPlace can be combined to the first timing-driven mixed-size placement flow. This combination often finds placements with competitive timing traits and even outperforms solutions that have been determined manually by experienced designers

    Broadening the Scope of Multi-Objective Optimizations in Physical Synthesis of Integrated Circuits.

    Full text link
    In modern VLSI design, physical synthesis tools are primarily responsible for satisfying chip-performance constraints by invoking a broad range of circuit optimizations, such as buffer insertion, logic restructuring, gate sizing and relocation. This process is known as timing closure. Our research seeks more powerful and efficient optimizations to improve the state of the art in modern chip design. In particular, we integrate timing-driven relocation, retiming, logic cloning, buffer insertion and gate sizing in novel ways to create powerful circuit transformations that help satisfy setup-time constraints. State-of-the-art physical synthesis optimizations are typically applied at two scales: i) global algorithms that affect the entire netlist and ii) local transformations that focus on a handful of gates or interconnections. The scale of modern chip designs dictates that only near-linear-time optimization algorithms can be applied at the global scope — typically limited to wirelength-driven placement and legalization. Localized transformations can rely on more time-consuming optimizations with accurate delay models. Few techniques bridge the gap between fully-global and localized optimizations. This dissertation broadens the scope of physical synthesis optimization to include accurate transformations operating between the global and local scales. In particular, we integrate groups of related transformations to break circular dependencies and increase the number of circuit elements that can be jointly optimized to escape local minima. Integrated transformations in this dissertation are developed by identifying and removing obstacles to successful optimizations. Integration is achieved through mapping multiple operations to rigorous mathematical optimization problems that can be solved simultaneously. We achieve computational scalability in our techniques by leveraging analytical delay models and focusing optimization efforts on carefully selected regions of the chip. In this regard, we make extensive use of a linear interconnect-delay model that accounts for the impact of subsequent repeated insertion. Our integrated transformations are evaluated on high-performance circuits with over 100,000 gates. Integrated optimization techniques described in this dissertation ensure graceful timing-closure process and impact nearly every aspect of a typical physical synthesis flow. They have been validated in EDA tools used at IBM for physical synthesis of high-performance CPU and ASIC designs, where they significantly improved chip performance.Ph.D.Computer Science & EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/78744/1/iamyou_1.pd
    corecore