Graphics Processing Unit-Based Computer-Aided Design Algorithms for Electronic Design Automation
Electronic design automation (EDA) tools are a specific set of software that plays an important role in modern integrated circuit (IC) design. These tools automate the various stages of the IC design process. Among these stages, two important EDA tools are the focus of this research: floorplanning and global routing. Specifically, the goal of this study is to parallelize these two tools so that their execution time can be significantly shortened on modern multi-core and graphics processing unit (GPU) architectures. GPU hardware is a massively parallel architecture, enabling thousands of independent threads to execute concurrently. Although a small set of EDA tools can benefit from GPU acceleration, most algorithms in this field are designed with the single-core paradigm in mind. The floorplanning and global routing algorithms are among the latter, and it is difficult to obtain any speedup for them on the GPU because of their inherently sequential nature.
This work parallelizes the floorplanning and global routing algorithms through a novel approach and achieves significant speedups for both tools on GPU hardware. Specifically, with a complete overhaul of solution space and design space exploration, the GPU-based floorplanning algorithm delivers a 4-166X speedup while achieving similar or improved solutions compared with the sequential algorithm. The GPU-based global routing algorithm is shown to achieve significant speedup against existing state-of-the-art routers while delivering competitive solution quality. Importantly, this parallel model for global routing produces a stable solution that is independent of the level of parallelism. In summary, this research shows that, through a design paradigm overhaul, sequential algorithms can also benefit from the massively parallel architecture. The findings of this study have a positive impact on the efficiency and design quality of the modern EDA design flow.
Analytical Layer Planning for Nanometer VLSI Designs
In this thesis, we propose an intermediate sub-process between the placement and routing stages of physical design. The algorithm generates layer guidance for post-placement optimization techniques, especially buffer insertion. This issue has become critical in today's VLSI chip design because of timing, congestion, and increasingly non-uniform parasitics among different metal layers. In addition, as a step before routing, this layer planning algorithm accounts for routability by minimizing the overlap area between different nets. Moreover, layer directive information, which is a crucial concern in industrial design, is also considered in the algorithm.
The core problem is formulated as a nonlinear programming problem composed of an objective function and constraints, and is solved by the conjugate gradient method. The whole algorithm is implemented in C++ under the Linux operating system and tested on the ISPD2008 Global Routing Contest benchmarks. The experimental results, shown at the end of this thesis, confirm the effectiveness of our approach, especially with respect to routability.
3D Global Router: a Study to Optimize Congestion, Wirelength and Via for Circuit Layout
The increasing size of integrated circuits and the aggressive shrinking of process feature sizes in IC manufacturing pose significant challenges for traditional physical design problems. Various design rules significantly complicate the physical design problems, and the large problem size tolerates nothing but extremely efficient techniques. Leading physical design tools have to be powerful enough to handle complex design demands and nimble enough to waste no runtime. This thesis studies the challenges faced by global routing, one of the traditional physical design problems that needs to be pushed to its new limit. This work proposes three effective tools to tackle congestion, wire and via optimization in the global routing process, from three different aspects.
The number of vias generated during the global routing stage is a critical factor for the yield of integrated circuits. However, most global routers only approach the problem by charging a cost for vias in the maze routing cost function. The first work of this thesis, FastRoute 4.0, presents a global router that addresses the via number optimization problem throughout the entire global routing flow. It introduces via-aware Steiner tree generation, 3-bend routing and layer assignment with careful ordering to reduce via count. The integration of these three techniques with existing academic global routers achieves a significant reduction in via count without any sacrifice in runtime.
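The baseline approach mentioned above, charging a cost for vias inside the maze routing cost function, can be illustrated with a minimal sketch. This is not FastRoute 4.0 and not the thesis's code. The grid model, the function names `maze_route` and `count_vias`, the alternating preferred direction per layer, and the cost values are all assumptions made for illustration.

```python
import heapq

def maze_route(width, height, layers, source, target, via_cost=3.0, wire_cost=1.0):
    """Dijkstra-style maze routing on a 3D grid; nodes are (x, y, layer).

    Assumption: even layers allow only horizontal wires and odd layers
    only vertical wires, so changing direction requires a via.
    """
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == target:
            break
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        x, y, z = node
        # Layer changes are charged via_cost, planar steps wire_cost.
        moves = [((x, y, z + 1), via_cost), ((x, y, z - 1), via_cost)]
        if z % 2 == 0:
            moves += [((x + 1, y, z), wire_cost), ((x - 1, y, z), wire_cost)]
        else:
            moves += [((x, y + 1, z), wire_cost), ((x, y - 1, z), wire_cost)]
        for nxt, step in moves:
            nx, ny, nz = nxt
            if not (0 <= nx < width and 0 <= ny < height and 0 <= nz < layers):
                continue
            nd = d + step
            if nd < dist.get(nxt, float("inf")):
                dist[nxt] = nd
                prev[nxt] = node
                heapq.heappush(heap, (nd, nxt))
    if target not in dist:
        return None, float("inf")
    # Walk the predecessor links back from the target to recover the path.
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    return path, dist[target]

def count_vias(path):
    """Number of layer changes along a path."""
    return sum(1 for a, b in zip(path, path[1:]) if a[2] != b[2])

path, cost = maze_route(4, 3, 2, (0, 0, 0), (3, 2, 0))
print(cost, count_vias(path))
```

In this toy setting, raising `via_cost` only makes each via more expensive for one net at a time. It does nothing to coordinate tree topology or layer assignment across nets, which is the gap the abstract says FastRoute 4.0 targets.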
Despite recent developments in the popular rip-up and reroute framework, the congestion elimination process remains arbitrary and requires significant tuning. Global routing has congestion elimination as its first and foremost priority, and the congestion issue is becoming increasingly severe because of timing requirements and design for manufacturability. The second work of this thesis, an auction algorithm based pre-processing framework (APF) for global routing, focuses on how to eliminate congestion effectively. To achieve more consistent congestion elimination, the framework uses auction-based detour techniques to alleviate the impact of the greedy, sequential manner of maze routing, which remains a major drawback of the most popular global routing framework. APF first identifies the most congested global routing locations with an interval overflow lower bound technique. Then APF uses an auction-based detour algorithm to compute which nets to detour and where to detour them. The framework can be applied to any global router and would help it achieve significant improvement in both solution quality and runtime.
The third work in this thesis combines the advantages of the two frameworks used to minimize via usage in global routing: 3D routers with good solution quality and efficient 2D routers with a layer assignment process. The result is a new multi-level 3D global router called MGR (multi-level global router). MGR uses an efficient multi-level framework to reroute nets in the congested region on the 3D grid graph. Routing on the coarsened grid graph speeds up the global router, while 3D routing introduces fewer vias. The multi-level rerouting framework wraps three innovative routing techniques together: an adaptive resource reservation technique in the coarsening process, a new 3-terminal maze routing algorithm, and a network flow based solution propagation method in the uncoarsening process. As a result, MGR can achieve solution quality close to that of 3D routers with runtime comparable to that of 2D routers.
High-performance Global Routing for Trillion-gate Systems-on-Chips.
Due to aggressive transistor scaling, modern-day CMOS circuits have continually increased in both complexity and productivity. Modern semiconductor designs have narrower and more resistive wires, thereby shifting the performance bottleneck to interconnect delay. These trends considerably impact timing closure and call for improvements in high-performance physical design tools to keep pace with the current state of IC innovation.
As leading-edge designs may incorporate tens of millions of gates, algorithm and software scalability are crucial to achieving reasonable turnaround time. Moreover, with decreasing device sizes, optimizing traditional objectives is no longer sufficient.
Our research focuses on (i) expanding the capabilities of standalone global routing, (ii) extending global routing for use in different design applications, and (iii) integrating routing within broader physical design optimizations and flows, e.g., congestion-driven placement. Our first global router relies on integer-linear programming (ILP), and can solve fairly large problem instances to optimality. Our second, iterative global router relies on Lagrangian relaxation, where we relax the routing violation constraints to allow routing overflow at a penalty. In both approaches, our desire is to give the router the maximum degree of freedom within a specified context. Empirically, both routers produce competitive results within a reasonable amount of runtime. To improve routability, we explore the incorporation of routing with placement, where the router estimates congestion and feeds this information to the placer. In turn, the emphasis on runtime is heightened, as the router will be invoked multiple times. Empirically, our placement-and-route framework significantly improves the final solution's routability compared with performing the steps sequentially. To further enhance routability-driven placement, we (i) leverage incrementality to generate fast and accurate congestion maps, and (ii) develop several techniques to relieve cell-based and layout-based congestion. To broaden the scope of routing, we integrate a global router in a chip-design flow that addresses the buffer explosion problem.
PhD, Computer Science and Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies
http://deepblue.lib.umich.edu/bitstream/2027.42/98025/1/jinhu_1.pd
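The idea of relaxing capacity constraints and allowing overflow at a penalty can be sketched with a generic Lagrangian price update. This is a toy illustration under stated assumptions, not the router described in the thesis. The function name `lagrangian_route`, the precomputed candidate routes per net, the fixed step size, and the subgradient update rule are all hypothetical choices.

```python
def lagrangian_route(nets, capacity, base_cost=1.0, step=0.5, iterations=20):
    """Toy Lagrangian-relaxation routing loop.

    nets:     dict mapping net name -> list of candidate routes,
              each route being a list of edge ids (assumed precomputed).
    capacity: dict mapping edge id -> number of routes the edge can carry.
    """
    lam = {e: 0.0 for e in capacity}  # one multiplier (price) per edge
    choice = {}
    overflow = 0
    for _ in range(iterations):
        usage = {e: 0 for e in capacity}
        # Each net independently picks its cheapest route under current prices.
        for net, routes in nets.items():
            best = min(routes, key=lambda r: sum(base_cost + lam[e] for e in r))
            choice[net] = best
            for e in best:
                usage[e] += 1
        overflow = sum(max(0, usage[e] - capacity[e]) for e in capacity)
        if overflow == 0:
            break
        # Subgradient step: raise prices on overused edges, lower them on
        # underused ones, and keep every multiplier non-negative.
        for e in capacity:
            lam[e] = max(0.0, lam[e] + step * (usage[e] - capacity[e]))
    return choice, lam, overflow

nets = {"A": [["e1"], ["e2", "e3"]], "B": [["e1"], ["e4", "e5", "e6"]]}
capacity = {e: 1 for e in ["e1", "e2", "e3", "e4", "e5", "e6"]}
choice, lam, overflow = lagrangian_route(nets, capacity)
print(choice, overflow)
```

In this example both nets initially prefer the shared edge `e1`. Its price rises until net A's two-edge detour becomes cheaper, at which point the overflow disappears. With fully symmetric nets this naive update can oscillate, which is one reason practical routers use more careful step-size and ordering schemes.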
Techniques for Scalable and Effective Routability Evaluation
Routing congestion has become a critical layout challenge in nanoscale circuits since it is a critical factor in determining the routability of a design. An unroutable design is not useful even though it closes on all other design metrics. Fast design closure can only be achieved by accurately evaluating whether a design is routable or not early in the design cycle. Lately, it has become common to use a “light mode” version of a global router to quickly evaluate the routability of a given placement. This approach suffers from three weaknesses: (i) it does not adequately model local routing resources, which can cause incorrect routability predictions that are only detected late, during detailed routing; (ii) the congestion maps obtained by it tend to have isolated hot spots surrounded by noncongested spots, called “noisy hot spots”, which further affects the accuracy of routability evaluation; (iii) the metrics used to represent congestion may yield numbers that do not provide sufficient intuition to the designer, and they may often fail to predict the routability accurately. This paper presents solutions to these issues. First, we propose three approaches to model local routing resources. Second, we propose a smoothing technique to reduce the number of noisy hot spots and obtain a more accurate routability evaluation result. Finally, we develop a new metric which represents congestion maps with higher fidelity. We apply the proposed techniques to several industrial circuits and demonstrate that one can better predict and evaluate design routability, and congestion mitigation tools can perform muc
A Layer Centric VLSI Physical Design Methodology Considering Non-uniform Metal Stacks
VLSI technology scaling has caused interconnect delay to increasingly dominate overall chip performance. Optimization techniques such as buffer insertion, wire sizing and layer assignment play critical roles in successful timing closure for chip designs. For several VLSI technology generations, designers have confronted the challenges associated with increasing wire delays. One industrial solution is to add layers of thicker metal to the wiring stacks. However, the existing physical synthesis tools are not effective enough to handle these new thick metal layers. Thus, it is necessary to design a new flow that provides better communication among layer planning, buffering, routing and the different optimization engines. This thesis proposes a new design flow, Layer Centric Design Flow, to perform congestion mitigation and timing optimization with layer directives. The flow balances buffer and routing resources so that the design benefits from the availability of thick metal layers and reduces buffer usage while maintaining routability as well as performance.
Initial detailed routing algorithms
In this work, we present a study of the routing problem in the context of the VLSI physical synthesis flow. We study fundamental routing algorithms such as maze routing, A*, and Steiner tree-based algorithms, as well as some global routing algorithms, namely FastRoute 4.0 and BoxRouter 2.0. We dissect some of the major state-of-the-art initial detailed routing tools, such as RegularRoute, TritonRoute, SmartDR and Dr.CU 2.0. We also propose an initial detailed routing flow and present an implementation of it, with a track assignment technique that models the problem as an instance of the maximum weighted independent set (MWIS) problem and uses integer linear programming (ILP) as a solver. The implementation also includes a multiple-source, multiple-target A* for terminal and net connection with adjustable rules and weights. Finally, we present a study of the results obtained by the implementation of the proposed flow and a comparison with the ISPD 2019 contest winners, using the ISPD 2019 benchmark suite and evaluation tools.
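To make the MWIS view of track assignment concrete, here is a small sketch. The abstract states that the work solves the problem with ILP. The example below swaps in exhaustive search instead, which is only practical for tiny instances. The encoding of candidates as (segment, track) pairs, the weights, and the function name are assumptions for illustration.

```python
from itertools import combinations

def max_weight_independent_set(weights, conflicts):
    """Exhaustive MWIS: heaviest set of nodes with no conflicting pair.

    weights:   dict mapping node -> weight (assumed positive).
    conflicts: iterable of node pairs that cannot both be selected.
    """
    nodes = list(weights)
    conflict_set = {frozenset(c) for c in conflicts}
    best, best_weight = (), 0
    for k in range(1, len(nodes) + 1):
        for subset in combinations(nodes, k):
            # Skip any subset containing a conflicting pair.
            if any(frozenset(p) in conflict_set for p in combinations(subset, 2)):
                continue
            w = sum(weights[n] for n in subset)
            if w > best_weight:
                best, best_weight = subset, w
    return set(best), best_weight

# Hypothetical instance: two overlapping segments, two tracks.
# A node is a (segment, track) candidate; its weight is the assignment's benefit.
weights = {("s1", "t0"): 3, ("s1", "t1"): 2, ("s2", "t0"): 3, ("s2", "t1"): 1}
conflicts = [
    (("s1", "t0"), ("s1", "t1")),  # a segment takes at most one track
    (("s2", "t0"), ("s2", "t1")),
    (("s1", "t0"), ("s2", "t0")),  # overlapping segments cannot share a track
    (("s1", "t1"), ("s2", "t1")),
]
assignment, total = max_weight_independent_set(weights, conflicts)
print(assignment, total)
```

Here the heaviest conflict-free selection places `s1` on `t1` and `s2` on `t0`, for a total weight of 5. An ILP formulation would express the same conflicts as pairwise constraints over binary variables.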
High-Performance Placement and Routing for the Nanometer Scale.
Modern semiconductor manufacturing facilitates single-chip electronic systems that only five years ago required ten to twenty chips. Naturally, design complexity has grown within this period. In contrast to this growth, it is becoming common in the industry to limit design team size which places a heavier burden on design automation tools.
Our work identifies new objectives, constraints and concerns in the physical design of systems-on-chip, and develops new computational techniques to address them. In addition to faster and more relevant design optimizations, we demonstrate that traditional design flows based on "separation of concerns" produce unnecessarily suboptimal layouts. We develop new integrated optimizations that streamline traditional chains of loosely-linked design tools. In particular, we bridge the gap between mixed-size placement and routing by updating the objective of global and detail placement to a more accurate estimate of routed wirelength. To this we add sophisticated whitespace allocation, and the combination provides increased routability, faster routing, shorter routed wirelength, and the best via counts of published techniques. To further improve post-routing design metrics, we present new global routing techniques based on Discrete Lagrange Multipliers (DLM) which produce the best routed wirelength results on recent benchmarks. Our work culminates in the integration of our routing techniques within an incremental placement flow to improve detailed routing solutions, shrink die sizes and reduce total chip cost.
Not only do our techniques improve the quality and cost of designs, but they also simplify design automation software implementation in many cases. Ultimately, we reduce the time needed for design closure through improved tool fidelity and the use of our incremental techniques for placement and routing.
Ph.D., Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies
http://deepblue.lib.umich.edu/bitstream/2027.42/64639/1/royj_1.pd