30 research outputs found
Timing optimization during the physical synthesis of cell-based VLSI circuits
Tese (doutorado) - Universidade Federal de Santa Catarina, Centro Tecnológico, Programa de Pós-Graduação em Engenharia de Automação e Sistemas, Florianópolis, 2016.Abstract : The evolution of CMOS technology made possible integrated circuits with billions of transistors assembled into a single silicon chip, giving rise to the jargon Very-Large-Scale Integration (VLSI). The required clock frequency affects the performance of a VLSI circuit and induces timing constraints that must be properly handled by synthesis tools. During the physical synthesis of VLSI circuits, several optimization techniques are used to iteratively reduce the number of timing violations until the target clock frequency is met. The dramatic increase of interconnect delay under technology scaling represents one of the major challenges for the timing closure of modern VLSI circuits. In this scenario, effective interconnect synthesis techniques play a major role. That is why this thesis targets two timing optimization problems for effective interconnect synthesis: Incremental Timing-Driven Placement (ITDP) and Incremental Timing-Driven Layer Assignment (ITLA). For solving the ITDP problem, this thesis proposes a new Lagrangian Relaxation formulation that minimizes timing violations for both setup and hold timing constraints. This work also proposes a netbased technique that uses Lagrange multipliers as net-weights, which are dynamically updated using an accurate timing analyzer. The netbased technique makes use of a novel discrete search to relocate cells by employing the Euclidean distance to define a proper neighborhood. For solving the ITLA problem, this thesis proposes a network flow approach that handles simultaneously critical and non-critical segments, and exploits a few flow conservation conditions to extract timing information for each net segment individually, thereby enabling the use of an external timing engine. The experimental validation using benchmark suites derived from industrial circuits demonstrates the effectiveness of the proposed techniques when compared with state-of-the-art works.A evolução da tecnologia CMOS viabilizou a fabricação de circuitos integrados contendo bilhões de transistores em uma única pastilha de silício, dando origem ao jargão Very-Large-Scale Integration (VLSI). A frequência-alvo de operação de um circuito VLSI afeta o seu desempenho e induz restrições de timing que devem ser manipuladas pelas ferramentas de síntese. Durante a síntese física de circuitos VLSI, diversas técnicas de otimização são usadas para iterativamente reduzir o número de violações de timing até que a frequência-alvo de operação seja atingida. O aumento dramático do atraso das interconexões devido à evolução tecnológica representa um dos maiores desafios para o fluxo de timing closure de circuitos VLSI contemporâneos. Nesse cenário, técnicas de síntese de interconexão eficientes têm um papel fundamental. Por este motivo, esta tese aborda dois problemas de otimização de timing para uma síntese eficiente das interconexões de um circuito VLSI: Incremental Timing-Driven Placement (ITDP) e Incremental Timing-Driven Layer Assignment (ITLA). Para resolver o problema de ITDP, esta tese propõe uma nova formulação utilizando Relaxação Lagrangeana que tem por objetivo a minimização simultânea das violações de timing para restrições do tipo setup e hold. Este trabalho também propõe uma técnica que utiliza multiplicadores de Lagrange como pesos para as interconexões, os quais são atualizados dinamicamente através dos resultados de uma ferramenta de análise de timing. Tal técnica realoca as células do circuito por meio de uma nova busca discreta que adota a distância Euclidiana como vizinhança.Para resolver o problema de ITLA, esta tese propõe uma abordagem em fluxo em redes que otimiza simultaneamente segmentos críticos e não-críticos, e explora algumas condições de fluxo para extrair as informações de timing para cada segmento individualmente, permitindo assim o uso de uma ferramenta de timing externa. A validação experimental, utilizando benchmarks derivados de circuitos industriais, demonstra a eficiência das técnicas propostas quando comparadas com trabalhos estado da arte
High-performance Global Routing for Trillion-gate Systems-on-Chips.
Due to aggressive transistor scaling, modern-day CMOS circuits have continually increased in both complexity and productivity. Modern semiconductor designs have narrower and more resistive wires, thereby shifting the performance bottleneck to interconnect delay. These trends considerably impact timing closure and call for improvements in high-performance physical design tools to keep pace with the current state of IC innovation.
As leading-edge designs may incorporate tens of millions of gates, algorithm and software scalability are crucial to achieving reasonable turnaround time. Moreover, with decreasing device sizes, optimizing traditional objectives is no longer sufficient.
Our research focuses on (i) expanding the capabilities of standalone global routing, (ii) extending global routing for use in different design applications, and (iii) integrating routing within broader physical design optimizations and flows, e.g., congestion-driven
placement. Our first global router relies on integer-linear programming (ILP), and can solve fairly large problem instances to optimality. Our second iterative global router relies on Lagrangian relaxation, where we relax the routing violation constraints to allowing routing overflow at a penalty. In both approaches, our desire is to give the router the maximum degree of freedom within a specified context. Empirically, both routers produce competitive results within a reasonable amount of runtime. To improve routability, we explore the incorporation of routing with placement, where the router estimates congestion and feeds this information to the placer. In turn, the emphasis on runtime is heightened, as the router will be invoked multiple times. Empirically, our placement-and-route framework significantly improves the final solution’s routability than performing the steps sequentially. To further enhance routability-driven placement, we (i) leverage incrementality to generate fast and accurate congestion maps, and (ii) develop several techniques to relieve cell-based and layout-based congestion. To broaden the scope of routing, we integrate a global router in a chip-design flow that addresses the buffer explosion problem.PHDComputer Science and EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/98025/1/jinhu_1.pd
Recommended from our members
Interconnect optimizations for nanometer VLSI design
textAs the semiconductor technology scales into deeper sub-micron domain, billions of transistors can be used on a single system-on-chip (SOC) makes interconnection optimization more important roughly for two reasons. First, congestion, power, timing in routing and buffering requirements make inter- connection optimization more and more challenging. Second, gate delay get- ting shorter while the RC delay gets longer due to scaling. Study of interconnection construction and optimization algorithms in real industry flows and designs ends up with interesting findings. One used to be overlooked but very important and practical problem is how to utilize over- the-block routing resources intelligently. Routing over large IP blocks needs special attention as there is almost no way to insert buffers inside hard IP blocks, which can lead to unsolvable slew/timing violations. In current design flows we have seen, the routing resources over the IP blocks were either dealt as routing blockages leading to a significant waste, or simply treated in the same way as outside-the-block routing resources, which would violate the slew constraints and thus fail buffering. To handle that, this work proposes a novel buffering-aware over-the- block rectilinear Steiner minimum tree (BOB-RSMT) algorithm which helps reclaim the “wasted” over-the-block routing resources while meeting user-specified slew constraints. Proposed algorithm incrementally and efficiently migrates initial tree structures with buffering-awareness to meet slew constraints while minimizing wire-length. Moreover, due to the fact that timing optimization is important for the VLSI design, in this work, timing-driven over-the-block rectilinear Steiner tree (TOB-RST) is also studied to optimize critical paths. This proposed TOB-RST algorithm can be used in routing or post-routing stage to provide high-quality topologies to help close timing. Then a follow-up problem emerges: how to accomplish the whole routing with over-the-block routing resources used properly. Utilizing over-the- block routing resources could dramatically improve the routing solution, yet require special attention, since the slew, affected by different RC on different metal layers, must be constrained by buffering and is easily violated. Moreover, even of all nets are slew-legalized, the routing solution could still suffer from heavy congestion problem. A new global router, BOB-Router, is to solve the over-the-block global routing problem through minimizing overflows, wire-length and via count simultaneously without violating slew constraints. Based on my completed works, BOB-RSMT and BOB-Router tremendously improve the overall routing and buffering quality. Experimental results show that proposed over-the-block rectilinear Steiner tree construction and routing completely satisfies the slew constraints and significantly outperforms the obstacle-avoiding rectilinear Steiner tree construction and routing in terms of wire-length, via count and overflows.Electrical and Computer Engineerin
Enhancing Power Efficient Design Techniques in Deep Submicron Era
Excessive power dissipation has been one of the major bottlenecks for design and
manufacture in the past couple of decades. Power efficient design has become
more and more challenging when technology scales down to the deep submicron era
that features the dominance of leakage, the manufacture variation, the on-chip
temperature variation and higher reliability requirements, among others. Most of the computer aided design (CAD) tools and algorithms currently used in industry
were developed in the pre deep submicron era and did not consider the new features explicitly and adequately.
Recent research advances in deep submicron design, such as the mechanisms of leakage, the source and characterization of manufacture variation, the cause and
models of on-chip temperature variation, provide us the opportunity to incorporate these important issues in power efficient design. We explore this opportunity in this dissertation by demonstrating that significant power reduction can be achieved with only minor modification to the existing CAD tools and algorithms.
First, we consider peak current, which has become critical for circuit's reliability in deep submicron design. Traditional low power design techniques focus on
the reduction of average power. We propose to reduce peak current while keeping the overhead on average power as small as possible. Second, dual Vt technique and gate sizing have been used simultaneously for leakage savings. However, this approach becomes less effective in deep submicron design. We propose to use the newly developed process-induced mechanical stress to enhance its performance.
Finally, in deep submicron design, the impact of on-chip temperature variation on leakage and performance becomes more and more significant. We propose a temperature-aware dual Vt approach to alleviate hot spots and achieve further leakage reduction. We also consider this leakage-temperature dependency in the dynamic voltage scaling approach and discover that a commonly accepted result is incorrect for the current technology.
We conduct extensive experiments with popular design benchmarks, using the latest industry CAD tools and design libraries. The results show that our proposed enhancements are promising in power saving and are practical to solve the low power design challenges in deep submicron era
High-Performance Placement and Routing for the Nanometer Scale.
Modern semiconductor manufacturing facilitates single-chip electronic systems that only five years ago required ten to twenty chips. Naturally, design complexity has grown within this period. In contrast to this growth, it is becoming common in the industry to limit design team size which places a heavier burden on design automation tools.
Our work identifies new objectives, constraints and concerns in the physical design of systems-on-chip, and develops new computational techniques to address them. In addition to faster and more relevant design optimizations, we demonstrate that traditional design flows based on ``separation of concerns'' produce unnecessarily suboptimal layouts. We develop new integrated optimizations that streamline traditional chains of loosely-linked design tools. In particular, we bridge the gap between mixed-size placement and routing by updating the objective of global and detail placement to a more accurate estimate of routed wirelength. To this we add sophisticated whitespace allocation, and the combination provides increased routability, faster routing,
shorter routed wirelength, and the best via counts of published techniques. To further improve post-routing design metrics, we present new global routing techniques based on Discrete Lagrange Multipliers (DLM) which produce the best routed wirelength results on recent benchmarks. Our work culminates in the integration of our routing techniques within an incremental placement flow to
improve detailed routing solutions, shrink die sizes and reduce total chip cost.
Not only do our techniques improve the quality and cost of designs, but also simplify design automation software implementation in many cases. Ultimately, we reduce the time needed for design closure through improved tool fidelity and the use of our incremental techniques for placement and routing.Ph.D.Computer Science & EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/64639/1/royj_1.pd
Optimisation, Optimal Control and Nonlinear Dynamics in Electrical Power, Energy Storage and Renewable Energy Systems
The electrical power system is undergoing a revolution enabled by advances in telecommunications, computer hardware and software, measurement, metering systems, IoT, and power electronics. Furthermore, the increasing integration of intermittent renewable energy sources, energy storage devices, and electric vehicles and the drive for energy efficiency have pushed power systems to modernise and adopt new technologies. The resulting smart grid is characterised, in part, by a bi-directional flow of energy and information. The evolution of the power grid, as well as its interconnection with energy storage systems and renewable energy sources, has created new opportunities for optimising not only their techno-economic aspects at the planning stages but also their control and operation. However, new challenges emerge in the optimization of these systems due to their complexity and nonlinear dynamic behaviour as well as the uncertainties involved.This volume is a selection of 20 papers carefully made by the editors from the MDPI topic “Optimisation, Optimal Control and Nonlinear Dynamics in Electrical Power, Energy Storage and Renewable Energy Systems”, which was closed in April 2022. The selected papers address the above challenges and exemplify the significant benefits that optimisation and nonlinear control techniques can bring to modern power and energy systems
Parametric Yield of VLSI Systems under Variability: Analysis and Design Solutions
Variability has become one of the vital challenges that the
designers of integrated circuits encounter. variability becomes
increasingly important. Imperfect manufacturing process manifest
itself as variations in the design parameters. These variations
and those in the operating environment of VLSI circuits result in
unexpected changes in the timing, power, and reliability of the
circuits. With scaling transistor dimensions, process and
environmental variations become significantly important in the
modern VLSI design. A smaller feature size means that the physical
characteristics of a device are more prone to these
unaccounted-for changes. To achieve a robust design, the random
and systematic fluctuations in the manufacturing process and the
variations in the environmental parameters should be analyzed and
the impact on the parametric yield should be addressed.
This thesis studies the challenges and comprises solutions for
designing robust VLSI systems in the presence of variations.
Initially, to get some insight into the system design under
variability, the parametric yield is examined for a small circuit.
Understanding the impact of variations on the yield at the circuit
level is vital to accurately estimate and optimize the yield at
the system granularity. Motivated by the observations and results,
found at the circuit level, statistical analyses are performed,
and solutions are proposed, at the system level of abstraction, to
reduce the impact of the variations and increase the parametric
yield.
At the circuit level, the impact of the supply and threshold
voltage variations on the parametric yield is discussed. Here, a
design centering methodology is proposed to maximize the
parametric yield and optimize the power-performance trade-off
under variations. In addition, the scaling trend in the yield loss
is studied. Also, some considerations for design centering in the
current and future CMOS technologies are explored.
The investigation, at the circuit level, suggests that the
operating temperature significantly affects the parametric yield.
In addition, the yield is very sensitive to the magnitude of the
variations in supply and threshold voltage. Therefore, the spatial
variations in process and environmental variations make it
necessary to analyze the yield at a higher granularity. Here,
temperature and voltage variations are mapped across the chip to
accurately estimate the yield loss at the system level.
At the system level, initially the impact of process-induced
temperature variations on the power grid design is analyzed. Also,
an efficient verification method is provided that ensures the
robustness of the power grid in the presence of variations. Then,
a statistical analysis of the timing yield is conducted, by taking
into account both the process and environmental variations. By
considering the statistical profile of the temperature and supply
voltage, the process variations are mapped to the delay variations
across a die. This ensures an accurate estimation of the timing
yield. In addition, a method is proposed to accurately estimate
the power yield considering process-induced temperature and supply
voltage variations. This helps check the robustness of the
circuits early in the design process.
Lastly, design solutions are presented to reduce the power
consumption and increase the timing yield under the variations. In
the first solution, a guideline for floorplaning optimization in
the presence of temperature variations is offered. Non-uniformity
in the thermal profiles of integrated circuits is an issue that
impacts the parametric yield and threatens chip reliability.
Therefore, the correlation between the total power consumption and
the temperature variations across a chip is examined. As a result,
floorplanning guidelines are proposed that uses the correlation to
efficiently optimize the chip's total power and takes into account
the thermal uniformity.
The second design solution provides an optimization methodology
for assigning the power supply pads across the chip for maximizing
the timing yield. A mixed-integer nonlinear programming (MINLP)
optimization problem, subject to voltage drop and current
constraint, is efficiently solved to find the optimum number and
location of the pads