    Floorplan-guided placement for large-scale mixed-size designs

    In the nanometer-scale era, placement has become an extremely challenging stage in modern Very-Large-Scale Integration (VLSI) design. Millions of objects need to be placed legally within a chip region, while both the interconnection and the object distribution have to be optimized simultaneously. Due to the extensive use of Intellectual Property (IP) and embedded memory blocks, a design usually contains tens or even hundreds of big macros. A design with big movable macros and numerous standard cells is known as a mixed-size design. Because of the large size difference between big macros and standard cells, placing mixed-size designs is much more difficult than standard-cell placement. This work presents an efficient and high-quality placement tool to handle modern large-scale mixed-size designs. The tool is based on a new placement algorithm flow. The main idea is to use a fixed-outline floorplanning algorithm to guide a state-of-the-art analytical placer. The new flow consists of four steps: 1) The objects in the original netlist are clustered into blocks; 2) Floorplanning is performed on the blocks; 3) The blocks are shifted within the chip region to further optimize the wirelength; 4) With big macro locations fixed, incremental placement is applied to place the remaining objects. Several key techniques are proposed for the first two steps. These techniques focus on two aspects: 1) A hypergraph clustering algorithm that can cut down the original problem size without loss of placement Quality of Results (QoR); 2) A fixed-outline floorplanning algorithm that can provide good guidance to the analytical placer at the global level. The effectiveness of each key technique is demonstrated by promising experimental results compared with state-of-the-art algorithms. Moreover, on industrial mixed-size designs, the new placement tool shows better performance than other existing approaches.
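The abstract above repeatedly refers to optimizing wirelength but does not say which estimate the tool uses. A common estimate in placement work is half-perimeter wirelength (HPWL). The sketch below is only an illustration under that assumption. The names `hpwl`, `total_hpwl`, `positions`, and `nets` are hypothetical and do not come from the thesis.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def hpwl(net_pins: List[Point]) -> float:
    """Half-perimeter of the bounding box around the pins of one net."""
    if len(net_pins) < 2:
        # A net with fewer than two pins has no wire to estimate.
        return 0.0
    xs = [x for x, _ in net_pins]
    ys = [y for _, y in net_pins]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def total_hpwl(positions: Dict[str, Point], nets: List[List[str]]) -> float:
    """Sum HPWL over all nets; each net is a list of object names."""
    return sum(hpwl([positions[name] for name in net]) for net in nets)


# Hypothetical toy design: two macros and one standard cell, treated as points.
positions = {"macro_a": (0.0, 0.0), "macro_b": (4.0, 3.0), "cell_c": (1.0, 5.0)}
nets = [["macro_a", "macro_b"], ["macro_a", "macro_b", "cell_c"]]
print(total_hpwl(positions, nets))  # 7.0 for the first net plus 9.0 for the second
```

Under this reading, shifting blocks (step 3 of the flow) would be judged by whether it lowers such a total, although the actual cost function of the tool may differ.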

    Voltage island-driven floorplanning.

    Ma, Qiang. Thesis (M.Phil.)--Chinese University of Hong Kong, 2008. Includes bibliographical references (leaves 78-80). Abstracts in English and Chinese.

    Contents:
    Abstract
    Acknowledgement
    Chapter 1: Introduction
      1.1 Background
      1.2 Floorplanning
      1.3 Motivations
      1.4 Design Implementation of Voltage Islands
      1.5 Problem Formulation
      1.6 Progress on the Problem
      1.7 Contributions
      1.8 Thesis Organization
    Chapter 2: Literature Review on MSV
      2.1 Introduction
      2.2 MSV at Post-floorplan/Post Placement Stage
        2.2.1 "Post-Placement Voltage Island Generation under Performance Requirement"
        2.2.2 "Post-Placement Voltage Island Generation"
        2.2.3 "Timing-Constrained and Voltage-Island-Aware Voltage Assignment"
        2.2.4 "Voltage Island Generation under Performance Requirement for SoC Designs"
        2.2.5 "An ILP Algorithm for Post-Floorplanning Voltage-Island Generation Considering Power-Network Planning"
      2.3 MSV at Floorplan/Placement Stage
        2.3.1 "Architecting Voltage Islands in Core-based System-on-a-Chip Designs"
        2.3.2 "Voltage Island Aware Floorplanning for Power and Timing Optimization"
      2.4 Summary
    Chapter 3: MSV Driven Floorplanning
      3.1 Introduction
      3.2 Problem Formulation
      3.3 Algorithm Overview
      3.4 Optimal Island Partitioning and Voltage Assignment
        3.4.1 Voltage Islands in Non-subtrees
        3.4.2 Proof of Optimality
        3.4.3 Handling Island with Power Down Mode
        3.4.4 Speedup in Implementation and Complexity
        3.4.5 Varying Background Chip-level Voltage
      3.5 Simulated Annealing
        3.5.1 Moves
        3.5.2 Cost Function
      3.6 Experimental Results
        3.6.1 Extension to Minimize Level Shifters
        3.6.2 Extension to Consider Power Network Routing
      3.7 Summary
    Chapter 4: MSV Driven Floorplanning with Timing
      4.1 Introduction
      4.2 Problem Formulation
      4.3 Algorithm Overview
      4.4 Voltage Assignment Problem
        4.4.1 Lagrangian Relaxation
        4.4.2 Transformation into the Primal Minimum Cost Flow Problem
        4.4.3 Cost-Scaling Algorithm
        4.4.4 Solution Transformation
      4.5 Simulated Annealing
        4.5.1 Moves
        4.5.2 Speeding up heuristic
        4.5.3 Cost Function
        4.5.4 Annealing Schedule
      4.6 Experimental Results
      4.7 Summary
    Chapter 5: Conclusion
    Bibliography

    Temperature-Aware Design and Management for 3D Multi-Core Architectures

    Vertically-integrated 3D multiprocessor systems-on-chip (3D MPSoCs) provide the means to continue integrating more functionality within a unit area while enhancing manufacturing yields and runtime performance. However, 3D MPSoCs incur amplified thermal challenges that undermine their reliability. To address these issues, several advanced cooling technologies, alongside temperature-aware design-time optimizations and run-time management schemes, have been proposed. In this monograph, we provide an overall survey of recent advances in temperature-aware 3D MPSoC considerations. We explore recent advanced cooling strategies, thermal modeling frameworks, design-time optimizations and run-time thermal management schemes that are primarily targeted at 3D MPSoCs. Our aim in proposing this survey is to provide a global perspective, highlighting the advancements and drawbacks of the recent state-of-the-art

    Fixed-outline bus-driven floorplanning.

    Jiang, Yan. Thesis (M.Phil.)--Chinese University of Hong Kong, 2011. Includes bibliographical references (p. 87-92). Abstracts in English and Chinese.

    Contents:
    Abstract
    Acknowledgement
    Chapter 1: Introduction
      1.1 Physical Design
      1.2 Floorplanning
        1.2.1 Floorplanning Objectives
        1.2.2 Common Approaches
      1.3 Motivations and Contributions
      1.4 Organization of the Thesis
    Chapter 2: Literature Review on BDF
      2.1 Zero-Bend BDF
        2.1.1 BDF Using the Sequence-Pair Representation
        2.1.2 Using B*-Tree and Fast SA
      2.2 Two-Bend BDF
      2.3 TCG-Based Multi-Bend BDF
        2.3.1 Placement Constraints for Bus
        2.3.2 Bus Ordering
      2.4 Bus-Pin-Aware BDF
      2.5 Summary
    Chapter 3: Fixed-Outline BDF
      3.1 Introduction
      3.2 Problem Formulation
      3.3 The Overview of Our Approach
      3.4 Partitioning
        3.4.1 The Overview of Partitioning
        3.4.2 Building a Hypergraph G
      3.5 Floorplanning with Bus Routing
        3.5.1 Find Bus Routes
        3.5.2 Realization of Bus Routes
        3.5.3 Details of the Annealing Process
      3.6 Handle Fixed-Outline Constraints
      3.7 Bus Layout
      3.8 Experimental Results
      3.9 Summary
    Chapter 4: Fixed-Outline BDF with L-shape bus
      4.1 Introduction
      4.2 Problem Formulation
      4.3 Our Approach
        4.3.1 Bus Routability Checking
        4.3.2 Details of the Annealing Process
      4.4 Experimental Results
      4.5 Summary
    Chapter 5: Conclusion
    Bibliography

    High-Performance Placement and Routing for the Nanometer Scale.

    Modern semiconductor manufacturing facilitates single-chip electronic systems that only five years ago required ten to twenty chips. Naturally, design complexity has grown within this period. In contrast to this growth, it is becoming common in the industry to limit design team size, which places a heavier burden on design automation tools. Our work identifies new objectives, constraints and concerns in the physical design of systems-on-chip, and develops new computational techniques to address them. In addition to faster and more relevant design optimizations, we demonstrate that traditional design flows based on "separation of concerns" produce unnecessarily suboptimal layouts. We develop new integrated optimizations that streamline traditional chains of loosely-linked design tools. In particular, we bridge the gap between mixed-size placement and routing by updating the objective of global and detail placement to a more accurate estimate of routed wirelength. To this we add sophisticated whitespace allocation, and the combination provides increased routability, faster routing, shorter routed wirelength, and the best via counts of published techniques. To further improve post-routing design metrics, we present new global routing techniques based on Discrete Lagrange Multipliers (DLM), which produce the best routed wirelength results on recent benchmarks. Our work culminates in the integration of our routing techniques within an incremental placement flow to improve detailed routing solutions, shrink die sizes and reduce total chip cost. Not only do our techniques improve the quality and cost of designs, but they also simplify design automation software implementation in many cases. Ultimately, we reduce the time needed for design closure through improved tool fidelity and the use of our incremental techniques for placement and routing.
    Ph.D., Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies.
    http://deepblue.lib.umich.edu/bitstream/2027.42/64639/1/royj_1.pd

    Cross-layer design of thermally-aware 2.5D systems

    Over the past decade, CMOS technology scaling has slowed down. To sustain the historic performance improvement predicted by Moore's Law, in the mid-2000s the computing industry moved to using manycore systems and exploiting parallelism. The on-chip power densities of manycore systems, however, continued to increase after the breakdown of Dennard's Scaling. This leads to the "dark silicon" problem, whereby not all cores can operate at the highest frequency or can be turned on simultaneously due to thermal constraints. As a result, we have not been able to take full advantage of the parallelism in manycore systems. One of the "More than Moore" approaches that is being explored to address this problem is integration of diverse functional components onto a substrate using 2.5D integration technology. 2.5D integration provides opportunities to exploit chiplet placement flexibility to address the dark silicon problem and mitigate the thermal stress of today's high-performance systems. These opportunities can be leveraged to improve the overall performance of the manycore heterogeneous computing systems. Broadly, this thesis aims at designing thermally-aware 2.5D systems. More specifically, to address the dark silicon problem of manycore systems, we first propose a single-layer thermally-aware chiplet organization methodology for homogeneous 2.5D systems. The key idea is to strategically insert spacing between the chiplets of a 2.5D manycore system to lower the operating temperature, and thus reclaim dark silicon by allowing more active cores and/or higher operating frequency under a temperature threshold. We investigate manufacturing cost and thermal behavior of 2.5D systems, then formulate and solve an optimization problem that jointly maximizes performance and minimizes manufacturing cost. We then enhance our methodology by incorporating a cross-layer co-optimization approach. 
We jointly maximize performance and minimize manufacturing cost and operating temperature across logical, physical, and circuit layers. We propose a novel gas-station link design that enables pipelining in passive interposers. We then extend our thermally-aware optimization methodology for network routing and chiplet placement of heterogeneous 2.5D systems, which consist of central processing unit (CPU) chiplets, graphics processing unit (GPU) chiplets, accelerator chiplets, and/or memory stacks. We jointly minimize the total wirelength and the system temperature. Our enhanced methodology increases the thermal design power budget and thereby improves thermal-constraint performance of the system

    Parametric Yield of VLSI Systems under Variability: Analysis and Design Solutions

    Variability has become one of the vital challenges that the designers of integrated circuits encounter. Imperfect manufacturing processes manifest themselves as variations in the design parameters. These variations, and those in the operating environment of VLSI circuits, result in unexpected changes in the timing, power, and reliability of the circuits. With scaling transistor dimensions, process and environmental variations become significantly important in modern VLSI design. A smaller feature size means that the physical characteristics of a device are more prone to these unaccounted-for changes. To achieve a robust design, the random and systematic fluctuations in the manufacturing process and the variations in the environmental parameters should be analyzed, and their impact on the parametric yield should be addressed. This thesis studies the challenges and presents solutions for designing robust VLSI systems in the presence of variations. Initially, to get some insight into system design under variability, the parametric yield is examined for a small circuit. Understanding the impact of variations on the yield at the circuit level is vital to accurately estimate and optimize the yield at the system granularity. Motivated by the observations and results found at the circuit level, statistical analyses are performed and solutions are proposed at the system level of abstraction to reduce the impact of the variations and increase the parametric yield. At the circuit level, the impact of the supply and threshold voltage variations on the parametric yield is discussed. Here, a design centering methodology is proposed to maximize the parametric yield and optimize the power-performance trade-off under variations. In addition, the scaling trend in the yield loss is studied. Also, some considerations for design centering in current and future CMOS technologies are explored. 
The investigation at the circuit level suggests that the operating temperature significantly affects the parametric yield. In addition, the yield is very sensitive to the magnitude of the variations in supply and threshold voltage. Therefore, the spatial variations in process and environmental parameters make it necessary to analyze the yield at a higher granularity. Here, temperature and voltage variations are mapped across the chip to accurately estimate the yield loss at the system level. At the system level, initially the impact of process-induced temperature variations on the power grid design is analyzed. Also, an efficient verification method is provided that ensures the robustness of the power grid in the presence of variations. Then, a statistical analysis of the timing yield is conducted, taking into account both the process and environmental variations. By considering the statistical profile of the temperature and supply voltage, the process variations are mapped to the delay variations across a die. This ensures an accurate estimation of the timing yield. In addition, a method is proposed to accurately estimate the power yield considering process-induced temperature and supply voltage variations. This helps check the robustness of the circuits early in the design process. Lastly, design solutions are presented to reduce the power consumption and increase the timing yield under the variations. In the first solution, a guideline for floorplanning optimization in the presence of temperature variations is offered. Non-uniformity in the thermal profiles of integrated circuits is an issue that impacts the parametric yield and threatens chip reliability. Therefore, the correlation between the total power consumption and the temperature variations across a chip is examined. As a result, floorplanning guidelines are proposed that use the correlation to efficiently optimize the chip's total power and take into account the thermal uniformity. 
The second design solution provides an optimization methodology for assigning the power supply pads across the chip to maximize the timing yield. A mixed-integer nonlinear programming (MINLP) optimization problem, subject to voltage drop and current constraints, is efficiently solved to find the optimum number and location of the pads.
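The timing yield discussed in this abstract can be read as the fraction of manufactured dies whose delay meets a target. The sketch below is a minimal Monte Carlo illustration of that idea, assuming a single path delay with Gaussian variation. That model, and the names `timing_yield`, `nominal_delay`, `sigma`, and `max_delay`, are assumptions made here for illustration. The thesis describes a richer analysis that maps temperature and supply-voltage profiles across the die.

```python
import random


def timing_yield(nominal_delay: float, sigma: float, max_delay: float,
                 samples: int = 10_000, seed: int = 0) -> float:
    """Estimate the fraction of sampled dies whose delay is within max_delay.

    Assumes (for illustration only) that the delay of each die is drawn
    from a Gaussian with mean nominal_delay and standard deviation sigma.
    """
    rng = random.Random(seed)  # fixed seed keeps the estimate reproducible
    passed = sum(
        1 for _ in range(samples)
        if rng.gauss(nominal_delay, sigma) <= max_delay
    )
    return passed / samples


# Hypothetical numbers: 1.0 ns nominal delay, 0.1 ns spread, 1.1 ns constraint.
print(timing_yield(1.0, 0.1, 1.1))
```

With the constraint one standard deviation above the nominal delay, the estimate should land near the Gaussian one-sigma value of roughly 0.84. Tightening `max_delay` or increasing `sigma` lowers it.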

    Automatic synthesis and optimization of chip multiprocessors

    Microprocessor technology has experienced enormous growth during the last decades. Rapid downscaling of CMOS technology has led to higher operating frequencies and performance densities, facing the fundamental issue of power dissipation. Chip Multiprocessors (CMPs) have become the latest paradigm to improve the power-performance efficiency of computing systems by exploiting the parallelism inherent in applications. Industrial and prototype implementations have already demonstrated the benefits achieved by CMPs with hundreds of cores.

    CMP architects are challenged to make many complex design decisions. Only a few of them are:
    - What should be the ratio between the core and cache areas on a chip?
    - Which core architectures to select?
    - How many cache levels should the memory subsystem have?
    - Which interconnect topologies provide efficient on-chip communication?

    These and many other aspects create a complex multidimensional space for architectural exploration. Design automation tools become essential to make the architectural exploration feasible under hard time-to-market constraints. The exploration methods have to be efficient and scalable to handle future-generation on-chip architectures with hundreds or thousands of cores.

    Furthermore, once a CMP has been fabricated, the need for efficient deployment of the many-core processor arises. Intelligent techniques for task mapping and scheduling onto CMPs are necessary to guarantee full use of the benefits brought by many-core technology. These techniques have to consider the peculiarities of modern architectures, such as the availability of enhanced power saving techniques and the presence of complex memory hierarchies.

    This thesis has several objectives. The first objective is to elaborate methods for efficient analytical modeling and architectural design space exploration of CMPs. The efficiency is achieved by using analytical models instead of simulation, and by replacing exhaustive exploration with an intelligent search strategy. Additionally, these methods incorporate high-level models for physical planning. The related contributions are described in Chapters 3, 4 and 5 of the document.

    The second objective of this work is to propose a scalable task mapping algorithm onto general-purpose CMPs with power management techniques, for efficient deployment of many-core systems. This contribution is explained in Chapter 6 of this document.

    Finally, the third objective of this thesis is to address the issues of on-chip interconnect design and exploration, by developing a model for simultaneous topology customization and deadlock-free routing in Networks-on-Chip. The developed methodology can be applied to various classes of on-chip systems, ranging from general-purpose chip multiprocessors to application-specific solutions. Chapter 7 describes the proposed model.

    The presented methods have been thoroughly tested experimentally and the results are described in this dissertation. At the end of the document, several possible directions for future research are proposed.