Throughput-driven floorplanning with wire pipelining
The size of future high-performance SoCs is such that the time-of-flight of wires connecting distant pins in the layout can be much longer than the clock period. To keep the frequency as high as possible, the wires may be pipelined. However, inserting flip-flops may alter the throughput of the system because of loops in the logic netlist. In this paper, we address the problem of floorplanning a large design in which long interconnects are pipelined, by including the throughput in the cost function of a tool based on simulated annealing. The results obtained on a series of benchmarks are then validated using a simple router that breaks long interconnects by suitably placing flip-flops along the wires.
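To illustrate why pipelining wires inside loops can reduce throughput, here is a minimal sketch. It assumes a latency-insensitive-style model in which a loop that originally holds m registers and receives n extra pipeline flip-flops sustains at most m/(m+n) of its original throughput. That model, the `reach` parameter (the distance a signal covers in one clock period), the weighted cost function, and all names are illustrative assumptions, not details taken from the paper.

```python
import math
from fractions import Fraction

def flops_needed(wire_length, reach):
    """Flip-flops to insert so that no wire segment is longer than `reach`."""
    return max(0, math.ceil(wire_length / reach) - 1)

def loop_throughput(registers, wire_lengths, reach):
    """Throughput of one loop under the assumed m / (m + n) model."""
    inserted = sum(flops_needed(length, reach) for length in wire_lengths)
    return Fraction(registers, registers + inserted)

def system_throughput(loops, reach):
    """The slowest loop bounds the throughput of the whole system."""
    return min(loop_throughput(m, wires, reach) for m, wires in loops)

def cost(area, wirelength, throughput, w_area=1.0, w_wire=1.0, w_tp=1.0):
    """Hypothetical annealing cost: lower is better, lost throughput is penalized."""
    return w_area * area + w_wire * wirelength + w_tp * (1 - float(throughput))

# Each loop: (original register count, lengths of the wires on the loop).
loops = [(2, [3.0, 9.0]), (3, [4.0, 4.0, 4.0])]
print(system_throughput(loops, reach=5.0))  # prints 2/3
```

In this example only the 9.0-long wire exceeds the reach, so the first loop gains one flip-flop and limits the system to two thirds of its original throughput. A floorplan that shortens that wire would remove the penalty term from the cost.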
Cross-Layer Pathfinding for Off-Chip Interconnects
Off-chip interconnects for integrated circuits (ICs) today induce a diverse design space, spanning many different applications that require transmission of data at various bandwidths, latencies and link lengths. Off-chip interconnect design solutions are also variously sensitive to system performance, power and cost metrics, while also having a strong impact on these metrics. The costs associated with off-chip interconnects include die area, package (PKG) and printed circuit board (PCB) area, technology and bill of materials (BOM). Choices made regarding off-chip interconnects are fundamental to product definition, architecture, design implementation and technology enablement. Given their cross-layer impact, it is imperative that a cross-layer approach be employed to architect and analyze off-chip interconnects up front, so that a top-down design flow can comprehend the cross-layer impacts and correctly assess the system performance, power and cost tradeoffs for off-chip interconnects. Chip architects are not exposed to all the tradeoffs at the physical and circuit implementation or technology layers, and often lack the tools to accurately assess off-chip interconnects. Furthermore, the collaterals needed for a detailed analysis are often lacking when the chip is architected; these include circuit design and layout, PKG and PCB layout, and physical floorplan and implementation. To address the need for a framework that enables architects to assess the system-level impact of off-chip interconnects, this thesis presents power-area-timing (PAT) models for off-chip interconnects, optimization and planning tools with the appropriate abstraction using these PAT models, and die/PKG/PCB co-design methods that help expose the off-chip interconnect cross-layer metrics to the die/PKG/PCB design flows. 
Together, these models, tools and methods enable cross-layer optimization that allows for a top-down definition and exploration of the design space and helps converge on the correct off-chip interconnect implementation and technology choice. The tools presented cover off-chip memory interfaces for mobile and server products, silicon photonic interfaces, 2.5D silicon interposers and 3D through-silicon vias (TSVs). The goal of the cross-layer framework is to assess the key metrics of the interconnect (such as timing, latency, active/idle/sleep power, and area/cost) at an appropriate level of abstraction across the layers of the design flow. In addition to signal interconnect, this thesis also explores the need for such cross-layer pathfinding for power distribution networks (PDN), where the system-on-chip (SoC) floorplan and pinmap must be optimized before the collateral layouts for PDN analysis are ready. Altogether, the developed cross-layer pathfinding methodology for off-chip interconnects enables more rapid and thorough exploration of a vast design space of off-chip parallel and serial links, inter-die and inter-chiplet links and silicon photonics. Such exploration will pave the way for off-chip interconnect technology enablement that is optimized for system needs. The basis of the framework can be extended to cover other interconnect technologies as well, since it fundamentally relates to system-level metrics that are common to all off-chip interconnects.
Physical Design Methodologies for Low Power and Reliable 3D ICs
As the semiconductor industry struggles to maintain its momentum along the path set by Moore's Law, three-dimensional integrated circuit (3D IC) technology has emerged as a promising solution to achieve higher integration density, better performance, and lower power consumption. However, despite its significant improvement in electrical performance, 3D IC technology presents several serious physical design challenges. In this dissertation, we investigate physical design methodologies for 3D ICs with a primary focus on two areas: low-power 3D clock tree design, and reliability degradation modeling and management.
Clock trees are essential parts of digital systems and dissipate a large amount of power because of their high capacitive loads. The majority of existing 3D clock tree designs focus on minimizing the total wire length, which produces sub-optimal results for power optimization. In this dissertation, we formulate a 3D clock tree design flow that directly optimizes for clock power. We also investigate the design methodology for clock gating a 3D clock tree, which uses shutdown gates to selectively turn off unnecessary clock activity. In contrast to the common assumption in 2D ICs that shutdown gates are cheap and can therefore be applied at every clock node, shutdown gates in 3D ICs introduce additional control TSVs, which compete with clock TSVs for placement resources. We explore design methodologies that produce the optimal allocation and placement of clock and control TSVs so that clock power is minimized. We show that the proposed synthesis flow saves significant clock power while accounting for the available TSV placement area.
Vertical integration also brings new reliability challenges, including TSV electromigration (EM) and several other reliability loss mechanisms caused by TSV-induced stress. These reliability loss models involve complex inter-dependencies between electrical and thermal conditions, which have not been investigated in the past. In this dissertation, we set up an electrical/thermal/reliability co-simulation framework to capture the transient behavior of reliability loss in 3D ICs. We further derive and validate an analytical reliability objective function that can be integrated into the 3D placement design flow. The reliability-aware placement scheme enables co-design and co-optimization of both electrical and reliability properties, thus improving both the circuit's performance and its lifetime. Our electrical/reliability co-design scheme avoids unnecessary design cycles and ad-hoc fixes that lead to sub-optimal performance.
Vertical integration also enables stacking DRAM on top of the CPU, providing high bandwidth and short latency. However, non-uniform voltage fluctuations and local thermal hotspots in the CPU layers are coupled into the DRAM layers, causing a non-uniform distribution of bit-cell leakage (and thereby bit flips). We propose a performance-power-resilience simulation framework to capture DRAM soft errors in 3D multi-core CPU systems. In addition, we investigate a dynamic resilience management (DRM) scheme, which adaptively tunes the CPU's operating points to adjust the DRAM's voltage noise and thermal conditions at runtime. The DRM uses dynamic frequency scaling to achieve a resilience borrow-in strategy, which effectively enhances the DRAM's resilience without sacrificing performance.
The proposed physical design methodologies should act as important building blocks for 3D ICs and push 3D ICs toward mainstream acceptance in the near future.
High performance algorithms for large scale placement problem
Placement is one of the most important problems in electronic design automation (EDA). An inferior placement solution will not only degrade the chip’s performance but might also make it nonmanufacturable by producing excessive wirelength that exceeds the available routing resources. Although placement has been investigated extensively for several decades, it remains a very challenging problem, mainly because design scale has increased by orders of magnitude and the trend shows no sign of stopping. In modern designs, chips commonly integrate millions of gates and require more than ten metal routing layers. In addition, new manufacturing techniques introduce new requirements, so multiple objectives must be optimized simultaneously during placement.
Our research provides high-performance algorithms for the placement problem. We propose (i) a high-performance global placement core engine, POLAR; (ii) an efficient routability-driven placer, POLAR 2.0, which extends POLAR to deal with routing congestion; (iii) an ultrafast global placer, POLAR 3.0, which exploits parallelism in POLAR and can make full use of multi-core systems; and (iv) several efficient triple patterning lithography (TPL) aware detailed placement algorithms.
Circuit delay optimization by buffering the logic gates
With ongoing miniaturization, interconnect delays are becoming increasingly important in circuits. To reduce this delay, buffer insertion must be performed during logic synthesis and physical synthesis. This optimization is often based on dynamic programming. In this thesis, the branch-and-bound technique is used, and the problem is solved for the specific case of balanced buffer trees, in which all loads have identical required times and capacitances. A mathematical analysis is carried out to account for a variety of design issues such as topology, the buffer library, and phase change in the presence of inverters. By combining dynamic programming and branch-and-bound techniques, a hybrid method is presented that improves run time while keeping memory usage reasonable. The fundamental mathematical and algorithmic concepts used in this thesis can be employed to generalize the proposed method to a set of loads with different capacitances and required times.
FieldPlacer - A flexible, fast and unconstrained force-directed placement method for heterogeneous reconfigurable logic architectures
The field of placement methods for components of integrated circuits, especially in the domain of reconfigurable chip architectures, is mainly dominated by a handful of concepts. While some of these are easy to apply but difficult to adapt to new situations, others are more flexible but rather complex to realize.
This work presents the FieldPlacer framework, a flexible, fast and unconstrained force-directed placement method for heterogeneous reconfigurable logic architectures, in particular for the ever important heterogeneous FPGAs.
In contrast to many other force-directed placers, this approach is called ‘unconstrained’ because it does not require a priori fixed logic elements in order to calculate a force equilibrium as the solution to a system of equations. Instead, it is based on a free spring embedder simulation of a graph representation that includes all logic block types of a design simultaneously. The FieldPlacer framework offers great flexibility in applying different distance norms (e.g., the Manhattan distance) for the force-directed layout and aims at creating adapted layouts for various objective functions, e.g., highest performance or improved routability. Depending on the individual situation, a runtime-quality trade-off can be considered to either produce a decent placement in a very short time or generate an exceptionally good placement, which takes longer.
An extensive comparison with the latest simulated annealing placement method from the well-known Versatile Place and Route (VPR) framework shows that the FieldPlacer approach can create placements of comparable quality much faster than VPR or, alternatively, generate better placements in the same time. The flexibility in defining arbitrary objective functions and the intuitive adaptability of the method, which includes, among other things, different concepts from the field of graph drawing, should facilitate further developments with this framework, e.g., for upcoming optimization targets such as the energy consumption of an implemented design.
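As a rough illustration of a free spring embedder that needs no fixed elements, the following sketch places graph nodes using linear attraction along edges and pairwise repulsion scaled by the Manhattan distance. It is a generic textbook-style embedder written for this listing, not the FieldPlacer algorithm. The force laws, the constants `k_attract` and `k_repel`, the step clamp, and all names are assumptions.

```python
import random

def manhattan(a, b):
    """Manhattan (L1) distance between two 2D points."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def spring_embed(nodes, edges, steps=200, k_attract=0.05, k_repel=0.5, seed=0):
    """Iterate simple spring forces from random starts; no node is fixed."""
    rng = random.Random(seed)
    pos = {n: [rng.uniform(0, 10), rng.uniform(0, 10)] for n in nodes}
    for _ in range(steps):
        force = {n: [0.0, 0.0] for n in nodes}
        # Attraction: each edge acts as a linear spring on both endpoints.
        for u, v in edges:
            for axis in (0, 1):
                d = pos[v][axis] - pos[u][axis]
                force[u][axis] += k_attract * d
                force[v][axis] -= k_attract * d
        # Repulsion: every pair pushes apart, weaker at larger L1 distance.
        for i, u in enumerate(nodes):
            for v in nodes[i + 1:]:
                dist = max(manhattan(pos[u], pos[v]), 1e-6)
                for axis in (0, 1):
                    push = k_repel * (pos[u][axis] - pos[v][axis]) / (dist * dist)
                    force[u][axis] += push
                    force[v][axis] -= push
        # Move each node, clamping the step per axis to keep the iteration stable.
        for n in nodes:
            for axis in (0, 1):
                pos[n][axis] += max(-1.0, min(1.0, force[n][axis]))
    return pos

layout = spring_embed(["lut0", "lut1", "dsp0", "ram0"],
                      [("lut0", "lut1"), ("lut1", "dsp0"), ("dsp0", "ram0")])
for name, (x, y) in layout.items():
    print(f"{name}: ({x:.2f}, {y:.2f})")
```

Because every node moves freely, the equilibrium comes from the balance of forces alone. With these force laws, an isolated connected pair settles at a Manhattan distance of sqrt(k_repel / k_attract). A real placer would additionally legalize the result onto the discrete, type-specific sites of the target architecture.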
Synthesis of clock distribution networks in the presence of manufacturing process variations
Design of clock distribution networks in presence of process variations -- Importance of spatial variations of the MOS transistor time constant -- Pipelined H-trees for high-speed clocking of large integrated systems in presence of process variations -- Design of reliable, low-power clock distribution networks -- Design of low-power and reliable logic-based H-trees -- Sources of spatial variations of the MOS transistor time constant -- Spatial characterization of process variations via MOS transistor time constants in VLSI & WSI -- Techniques for minimizing clock skew by delay calibration -- Minimizing process-induced skew using delay tuning
Network-on-Chip
Addresses the challenges associated with system-on-chip integration. Network-on-Chip: The Next Generation of System-on-Chip Integration examines the current issues restricting chip-on-chip communication efficiency, and explores Network-on-chip (NoC), a promising alternative that equips designers with the capability to produce a scalable, reusable, and high-performance communication backbone by allowing for the integration of a large number of cores on a single system-on-chip (SoC). This book provides a basic overview of topics associated with NoC-based design: communication infrastructure design, communication methodology, evaluation framework, and mapping of applications onto NoC. It details the design and evaluation of different proposed NoC structures, low-power techniques, signal integrity and reliability issues, application mapping, testing, and future trends. Utilizing examples of chips that have been implemented in industry and academia, this text presents the full architectural design of components verified through implementation in industrial CAD tools. It describes NoC research and developments, incorporates theoretical proofs strengthening the analysis procedures, and includes algorithms used in NoC design and synthesis. In addition, it considers other upcoming NoC issues, such as low-power NoC design, signal integrity issues, NoC testing, reconfiguration, synthesis, and 3-D NoC design.
This text comprises 12 chapters and covers: the evolution of NoC from SoC, with its research and developmental challenges; NoC protocols, elaborating flow control, available network topologies, routing mechanisms, fault tolerance, quality-of-service support, and the design of network interfaces; the router design strategies followed in NoCs; the evaluation mechanism of NoC architectures; the application mapping strategies followed in NoCs; low-power design techniques specifically followed in NoCs; the signal integrity and reliability issues of NoC; the details of NoC testing strategies reported so far; the problem of synthesizing application-specific NoCs; reconfigurable NoC design issues; and the direction of future research and development in the field of NoC. Network-on-Chip: The Next Generation of System-on-Chip Integration covers the basic topics, technology, and future trends relevant to NoC-based design, and can be used by engineers, students, researchers, and other industry professionals interested in computer architecture, embedded systems, and parallel/distributed systems.