24 research outputs found

    Interconnect performance estimation models for design planning

    Full text link

    Scalability and interconnection issues in floorplan design and floorplan representations.

    Get PDF
    Yuen Wing-seung.Thesis (M.Phil.)--Chinese University of Hong Kong, 2001.Includes bibliographical references (leaves [116]-[122]).Abstracts in English and Chinese.Abstract --- p.iAcknowledgments --- p.iiiList of Figures --- p.viiiList of Tables --- p.xiiChapter 1 --- Introduction --- p.1Chapter 1.1 --- Motivations and Aims --- p.1Chapter 1.2 --- Contributions --- p.3Chapter 1.3 --- Dissertation Overview --- p.4Chapter 2 --- Physical Design and Floorplanning in VLSI Circuits --- p.6Chapter 2.1 --- VLSI Design Flow --- p.6Chapter 2.2 --- Floorplan Design --- p.8Chapter 2.2.1 --- Problem Formulation --- p.9Chapter 2.2.2 --- Types of Floorplan --- p.10Chapter 3 --- Floorplanning Representations --- p.12Chapter 3.1 --- Polish Expression(PE) [WL86] --- p.12Chapter 3.2 --- Bounded-Sliceline-Grid(BSG) [NFMK96] --- p.14Chapter 3.3 --- Sequence Pair(SP) [MFNK95] --- p.17Chapter 3.4 --- O-tree(OT) [GCY99] --- p.19Chapter 3.5 --- B*-tree(BT) [CCWW00] --- p.21Chapter 3.6 --- Corner Block List(CBL) [HHC+00] --- p.22Chapter 4 --- Optimization Technique in Floorplan Design --- p.27Chapter 4.1 --- General Optimization Methods --- p.27Chapter 4.1.1 --- Simulated Annealing --- p.27Chapter 4.1.2 --- Genetic Algorithm --- p.29Chapter 4.1.3 --- Integer Programming Method --- p.31Chapter 4.2 --- Shape Optimization --- p.33Chapter 4.2.1 --- Shape Curve --- p.33Chapter 4.2.2 --- Lagrangian Relaxation --- p.34Chapter 5 --- Literature Review on Interconnect Driven Floorplanning --- p.37Chapter 5.1 --- Placement Constraint in Floorplan Design --- p.37Chapter 5.1.1 --- Boundary Constraints --- p.37Chapter 5.1.2 --- Pre-placed Constraints --- p.39Chapter 5.1.3 --- Range Constraints --- p.41Chapter 5.1.4 --- Symmetry Constraints --- p.42Chapter 5.2 --- Timing Analysis Method --- p.43Chapter 5.3 --- Buffer Block Planning and Congestion Control --- p.45Chapter 5.3.1 --- Buffer Block Planning --- p.45Chapter 5.3.2 --- Congestion Control --- p.50Chapter 6 --- Clustering Constraint in Floorplan Design --- p.53Chapter 6.1 --- Problem Definition --- p.53Chapter 6.2 --- Overview --- p.54Chapter 6.3 --- Locating Neighboring Modules --- p.56Chapter 6.4 --- Constraint Satisfaction --- p.62Chapter 6.5 --- Multi-clustering Extension --- p.64Chapter 6.6 --- Cost Function --- p.64Chapter 6.7 --- Experimental Results --- p.65Chapter 7 --- Interconnect Driven Multilevel Floorplanning Approach --- p.69Chapter 7.1 --- Multilevel Partitioning --- p.69Chapter 7.1.1 --- Coarsening Phase --- p.70Chapter 7.1.2 --- Refinement Phase --- p.70Chapter 7.2 --- Overview of Multilevel Floorplanner --- p.72Chapter 7.3 --- Clustering Phase --- p.73Chapter 7.3.1 --- Clustering Methods --- p.73Chapter 7.3.2 --- Area Ratio Constraints --- p.75Chapter 7.3.3 --- Clustering Velocity --- p.76Chapter 7.4 --- Refinement Phase --- p.77Chapter 7.4.1 --- Temperature Control --- p.79Chapter 7.4.2 --- Cost Function --- p.80Chapter 7.4.3 --- Handling Shape Flexibility --- p.80Chapter 7.5 --- Experimental Results --- p.81Chapter 7.5.1 --- Data Set Generation --- p.82Chapter 7.5.2 --- Temperature Control --- p.82Chapter 7.5.3 --- Packing Results --- p.83Chapter 8 --- Study of Non-slicing Floorplan Representations --- p.89Chapter 8.1 --- Analysis of Different Floorplan Representations --- p.89Chapter 8.1.1 --- Complexity --- p.90Chapter 8.1.2 --- Types of Floorplans --- p.92Chapter 8.2 --- T-junction Orientation Property --- p.97Chapter 8.3 --- Twin Binary Tree Representation for Mosaic Floorplan --- p.103Chapter 8.3.1 --- Previous work --- p.103Chapter 8.3.2 --- Twin Binary Tree Construction --- p.105Chapter 8.3.3 --- Floorplan Construction --- p.109Chapter 9 --- Conclusion --- p.114Chapter 9.1 --- Summary --- p.114Bibliography --- p.116Chapter A --- Clustering Constraint Data Set --- p.123Chapter A.1 --- ami33 --- p.123Chapter A.1.1 --- One cluster --- p.123Chapter A.1.2 --- Multi-cluster --- p.123Chapter A.2 --- ami49 --- p.124Chapter A.2.1 --- One cluster --- p.124Chapter A.2.2 --- Multi-cluster --- p.124Chapter A.3 --- playout --- p.124Chapter A.3.1 --- One cluster --- p.124Chapter A.3.2 --- Multi-cluster --- p.125Chapter B --- Multilevel Data Set --- p.126Chapter B.l --- data_100 --- p.126Chapter B.2 --- data_200 --- p.127Chapter B.3 --- data_300 --- p.129Chapter B.4 --- data_400 --- p.131Chapter B.5 --- data_500 --- p.13

    Design, Extraction, and Optimization Tool Flows and Methodologies for Homogeneous and Heterogeneous Multi-Chip 2.5D Systems

    Get PDF
    Chip and packaging industries are making significant progress in 2.5D design as a result of increasing popularity of their application. In advanced high-density 2.5D packages, package redistribution layers become similar to chip Back-End-of-Line routing layers, and the gap between them scales down with pin density improvement. Chiplet-package interactions become significant and severely affect system performance and reliability. Moreover, 2.5D integration offers opportunities to apply novel design techniques. The traditional die-by-die design approach neither carefully considers these interactions nor fully exploits the cross-boundary design opportunities. This thesis presents chiplet-package cross-boundary design, extraction, analysis, and optimization tool flows and methodologies for high-density 2.5D packaging technologies. A holistic flow is presented that can capture all parasitics from chiplets and the package and improve system performance through iterative optimizations. Several design techniques are demonstrated for agile development and quick turn-around time. To validate the flow in silicon, a chip was taped out and studied in TSMC 65nm technology. As the holistic flow cannot handle heterogeneous technologies, in-context flows are presented. Three different flavors of the in-context flow are presented, which offer trade-offs between scalability and accuracy in heterogeneous 2.5D system designs. Inductance is an inseparable part of a package design. A holistic flow is presented that takes package inductance into account in timing analysis and optimization steps. Custom CAD tools are developed to make these flows compatible with the industry standard tools and the foundry model. To prove the effectiveness of the flows several design cases of an ARM Cortex-M0 are implemented for comparitive study

    Physical design methodologies for monolithic 3D ICs

    Get PDF
    The objective of this research is to develop physical design methodologies for monolithic 3D ICs and use them to evaluate the improvements in the power-performance envelope offered over 2D ICs. In addition, design-for-test (DfT) techniques essential for the adoption of shorter term through-silicon-via (TSV) based 3D ICs are explored. Testing of TSV-based 3D ICs is one of the last challenges facing their commercialization. First, a pre-bond testable 3D scan chain construction technique is developed. Next, a transition-delay-fault test architecture is presented, along with a study on how to mitigate IR-drop. Finally, to facilitate partitioning, a quick and accurate framework for test-TSV estimation is developed. Block-level monolithic 3D ICs will be the first to emerge, as significant IP can be reused. However, no physical design flows exist, and hence a monolithic 3D floorplanning framework is developed. Next, inter-tier performance differences that arise due to the not yet mature fabrication process are investigated and modeled. Finally, an inter-tier performance-difference aware floorplanner is presented, and it is demonstrated that high quality 3D floorplans are achievable even under these inter-tier differences. Monolithic 3D offers sufficient integration density to place individual gates in three dimensions and connect them together. However, no tools or techniques exist that can take advantage of the high integration density offered. Therefore, a gate-level framework that leverages existing 2D ICs tools is presented. This framework also provides congestion modeling and produces results that minimize routing congestion. Next, this framework is extended to commercial 2D IC tools, so that steps such as timing optimization and clock tree synthesis can be applied. Finally, a voltage-drop-aware partitioning technique is presented that can alleviate IR-drop issues, without any impact on the performance or maximum operating temperature of the chip.Ph.D

    High-Performance Fpaa Design For Hierarchical Implementation Of Analog And Mixed-Signal Systems

    Get PDF
    The design complexity of today's IC has increased dramatically due to the high integration allowed by advanced CMOS VLSI process. A key to manage the increased design complexity while meeting the shortening time-to-market is design automation. In digital world, the field-programmable gate arrays (FPGAs) have evolved to play a very important role by providing ASIC-compatible design methodologies that include design-for-testability, design optimization and rapid prototyping. On the analog side, the drive towards shorter design cycles has demanded the development of high performance analog circuits that are configurable and suitable for CAD methodologies. Field-programmable analog arrays (FPAAs) are intended to achieve the benefits for analog system design as FPGAs have in the digital field. Despite of the obvious advantages of hierarchical analog design, namely short time-to-market and low non-recurring engineering (NRE) costs, this approach has some apparent disadvantages. The redundant devices and routing resources for programmability requires extra chip area, while switch and interconnect parasitics cause considerable performance degradation. To deliver a high-performance FPAA, effective methodologies must be developed to minimize those adversary effects. In this dissertation, three important aspects in the FPAA design are studied to achieve that goal: the programming technology, the configurable analog block (CAB) design and the routing architecture design. Enabled by the Laser MakelinkTM technology, which provides nearly ideal programmable switches, channel segmentation algorithms are developed to improve channel routability and reduce interconnect parasitics. Segmented routing are studied and performance metrics accounting for interconnect parasitics are proposed for performance-driven analog routing. For large scale arrays, buffer insertions are considered to further reduce interconnection delay and cross-coupling noise. A high-performance, highly flexible CAB is developed to realized both continuous-mode and switched-capacitor circuits. In the end, the implementation of an 8-bit, 50MSPS pipelined A/D converter using the proposed FPAA is presented as an example of the hierarchical analog design approach, with its key performance specifications discussed

    Structure discovery techniques for circuit design and process model visualization

    Get PDF
    Graphs are one of the most used abstractions in many knowledge fields because of the easy and flexibility by which graphs can represent relationships between objects. The pervasiveness of graphs in many disciplines means that huge amounts of data are available in graph form, allowing many opportunities for the extraction of useful structure from these graphs in order to produce insight into the data. In this thesis we introduce a series of techniques to resolve well-known challenges in the areas of digital circuit design and process mining. The underlying idea that ties all the approaches together is discovering structures in graphs. We show how many problems of practical importance in these areas can be solved utilizing both common and novel structure mining approaches. In the area of digital circuit design, this thesis proposes automatically discovering frequent, repetitive structures in a circuit netlist in order to improve the quality of physical planning. These structures can be used during floorplanning to produce regular designs, which are known to be highly efficient and economical. At the same time, detecting these repeating structures can exponentially reduce the total design time. The second focus of this thesis is in the area of the visualization of process models. Process mining is a recent area of research which centers on studying the behavior of real-life systems and their interactions with the environment. Complicated process models, however, hamper this goal. By discovering the important structures in these models, we propose a series of methods that can derive visualization-friendly process models with minimal loss in accuracy. In addition, and combining the areas of circuit design and process mining, this thesis opens the area of specification mining in asynchronous circuits. Instead of the usual design flow, which involves synthesizing circuits from specifications, our proposal discovers specifications from implemented circuits. This area allows for many opportunities for verification and re-synthesis of asynchronous circuits. The proposed methods have been tested using real-life benchmarks, and the quality of the results compared to the state-of-the-art.Els grafs són una de les representacions abstractes més comuns en molts camps de recerca, gràcies a la facilitat i flexibilitat amb la que poden representar relacions entre objectes. Aquesta popularitat fa que una gran quantitat de dades es puguin trobar en forma de graf, i obre moltes oportunitats per a extreure estructures d'aquest grafs, útils per tal de donar una intuïció millor de les dades subjacents. En aquesta tesi introduïm una sèrie de tècniques per resoldre reptes habitualment trobats en les àrees de disseny de circuits digitals i mineria de processos industrials. La idea comú sota tots els mètodes proposats es descobrir automàticament estructures en grafs. En la tesi es mostra que molts problemes trobats a la pràctica en aquestes àrees poden ser resolts utilitzant nous mètodes de descobriment d'estructures. En l'àrea de disseny de circuits, proposem descobrir, automàticament, estructures freqüents i repetitives en les definicions del circuit per tal de millorar la qualitat de les etapes posteriors de planificació física. Les estructures descobertes poden fer-se servir durant la planificació per produir dissenys regulars, que son molt més econòmics d'implementar. Al mateix temps, la descoberta i ús d'aquestes estructures pot reduir exponencialment el temps total de disseny. El segon punt focal d'aquesta tesi és en l'àrea de la visualització de models de processos industrials. La mineria de processos industrials es un tema jove de recerca que es centra en estudiar el comportament de sistemes reals i les interaccions d'aquests sistemes amb l'entorn. No obstant, quan d'aquest anàlisi s'obtenen models massa complexos visualment, l'estudi n'és problemàtic. Proposem una sèrie de mètodes que, gràcies al descobriment automàtic de les estructures més importants, poden generar models molt més fàcils de visualitzar que encara descriuen el comportament del sistema amb gran precisió. Combinant les àrees de disseny de circuits i mineria de processos, aquesta tesi també obre un nou tema de recerca: la mineria d'especificacions per circuits asíncrons. En l'estil de disseny asíncron habitual, sintetitzadors automàtics generen circuits a partir de les especificacions. En aquesta tesi proposem el pas invers: descobrir automàticament les especificacions de circuits ja implementats. Així, creem noves oportunitats per a la verificació i la re-síntesi de circuits asíncrons. Els mètodes proposats en aquesta tesi s'han validat fent servir dades obtingudes d'aplicacions pràctiques, i en comparem els resultats amb els mètodes existents

    Algorithms for the scaling toward nanometer VLSI physical synthesis

    Get PDF
    Along the history of Very Large Scale Integration (VLSI), we have successfully scaled down the size of transistors, scaled up the speed of integrated circuits (IC) and the number of transistors in a chip - these are just a few examples of our achievement in VLSI scaling. It is projected to enter the nanometer (timing estimation and buffer planning for global routing and other early stages such as floorplanning. A novel path based buffer insertion scheme is also included, which can overcome the weakness of the net based approaches. Part-2 Circuit clustering techniques with the application in Field-Programmable Gate Array (FPGA) technology mapping The problem of timing driven n-way circuit partitioning with application to FPGA technology mapping is studied and a hierarchical clustering approach is presented for the latest multi-level FPGA architectures. Moreover, a more general delay model is included in order to accurately characterize the delay behavior of the clusters and circuit elements

    Resource and thermal management in 3D-stacked multi-/many-core systems

    Full text link
    Continuous semiconductor technology scaling and the rapid increase in computational needs have stimulated the emergence of multi-/many-core processors. While up to hundreds of cores can be placed on a single chip, the performance capacity of the cores cannot be fully exploited due to high latencies of interconnects and memory, high power consumption, and low manufacturing yield in traditional (2D) chips. 3D stacking is an emerging technology that aims to overcome these limitations of 2D designs by stacking processor dies over each other and using through-silicon-vias (TSVs) for on-chip communication, and thus, provides a large amount of on-chip resources and shortens communication latency. These benefits, however, are limited by challenges in high power densities and temperatures. 3D stacking also enables integrating heterogeneous technologies into a single chip. One example of heterogeneous integration is building many-core systems with silicon-photonic network-on-chip (PNoC), which reduces on-chip communication latency significantly and provides higher bandwidth compared to electrical links. However, silicon-photonic links are vulnerable to on-chip thermal and process variations. These variations can be countered by actively tuning the temperatures of optical devices through micro-heaters, but at the cost of substantial power overhead. This thesis claims that unearthing the energy efficiency potential of 3D-stacked systems requires intelligent and application-aware resource management. Specifically, the thesis improves energy efficiency of 3D-stacked systems via three major components of computing systems: cache, memory, and on-chip communication. We analyze characteristics of workloads in computation, memory usage, and communication, and present techniques that leverage these characteristics for energy-efficient computing. This thesis introduces 3D cache resource pooling, a cache design that allows for flexible heterogeneity in cache configuration across a 3D-stacked system and improves cache utilization and system energy efficiency. We also demonstrate the impact of resource pooling on a real prototype 3D system with scratchpad memory. At the main memory level, we claim that utilizing heterogeneous memory modules and memory object level management significantly helps with energy efficiency. This thesis proposes a memory management scheme at a finer granularity: memory object level, and a page allocation policy to leverage the heterogeneity of available memory modules and cater to the diverse memory requirements of workloads. On the on-chip communication side, we introduce an approach to limit the power overhead of PNoC in (3D) many-core systems through cross-layer thermal management. Our proposed thermally-aware workload allocation policies coupled with an adaptive thermal tuning policy minimize the required thermal tuning power for PNoC, and in this way, help broader integration of PNoC. The thesis also introduces techniques in placement and floorplanning of optical devices to reduce optical loss and, thus, laser source power consumption.2018-03-09T00:00:00

    VLSI Design

    Get PDF
    This book provides some recent advances in design nanometer VLSI chips. The selected topics try to present some open problems and challenges with important topics ranging from design tools, new post-silicon devices, GPU-based parallel computing, emerging 3D integration, and antenna design. The book consists of two parts, with chapters such as: VLSI design for multi-sensor smart systems on a chip, Three-dimensional integrated circuits design for thousand-core processors, Parallel symbolic analysis of large analog circuits on GPU platforms, Algorithms for CAD tools VLSI design, A multilevel memetic algorithm for large SAT-encoded problems, etc
    corecore