45 research outputs found

    An integrated placement and routing approach

    Get PDF
    As the feature size continues scaling down, interconnects become the major contributor of signal delay. Since interconnects are mainly determined by placement and routing, these two stages play key roles to achieve high performance. Historically, they are divided into two separate stages to make the problem tractable. Therefore, the routing information is not available during the placement process. Net models such as HPWL, are employed to approximate the routing to simplify the placement problem. However, the good placement in terms of these objectives may not be routable at all in the routing stage because different objectives are optimized in placement and routing stages. This inconsistancy makes the results obtained by the two-step optimization method far from optimal;In order to achieve high-quality placement solution and ensure the following routing, we propose an integrated placement and routing approach. In this approach, we integrate placement and routing into the same framework so that the objective optimized in placement is the same as that in routing. Since both placement and routing are very hard problems (NP-hard), we need to have very efficient algorithms so that integrating them together will not lead to intractable complexity;In this dissertation, we first develop a highly efficient placer - FastPlace 3.0 for large-scale mixed-size placement problem. Then, an efficient and effective detailed placer - FastDP is proposed to improve global placement by moving standard cells in designs. For high-degree nets in designs, we propose a novel performance-driven topology design algorithm to generate good topologies to achieve very strict timing requirement. In the routing phase, we develop two global routers, FastRoute and FastRoute 2.0. Compared to traditional global routers, they can generate better solutions and are two orders of magnitude faster. Finally, based on these efficient and high-quality placement and routing algorithms, we propose a new flow which integrates placement and routing together closely. In this flow, global routing is extensively applied to obtain the interconnect information and direct the placement process. In this way, we can get very good placement solutions with guaranteed routability

    A Multiple-objective ILP based Global Routing Approach for VLSI ASIC Design

    Get PDF
    A VLSI chip can today contain hundreds of millions transistors and is expected to contain more than 1 billion transistors in the next decade. In order to handle this rapid growth in integration technology, the design procedure is therefore divided into a sequence of design steps. Circuit layout is the design step in which a physical realization of a circuit is obtained from its functional description. Global routing is one of the key subproblems of the circuit layout which involves finding an approximate path for the wires connecting the elements of the circuit without violating resource constraints. The global routing problem is NP-hard, therefore, heuristics capable of producing high quality routes with little computational effort are required as we move into the Deep Sub-Micron (DSM) regime. In this thesis, different approaches for global routing problem are first reviewed. The advantages and disadvantages of these approaches are also summarized. According to this literature review, several mathematical programming based global routing models are fully investigated. Quality of solution obtained by these models are then compared with traditional Maze routing technique. The experimental results show that the proposed model can optimize several global routing objectives simultaneously and effectively. Also, it is easy to incorporate new objectives into the proposed global routing model. To speedup the computation time of the proposed ILP based global router, several hierarchical methods are combined with the flat ILP based global routing approach. The experimental results indicate that the bottom-up global routing method can reduce the computation time effectively with a slight increase of maximum routing density. In addition to wire area, routability, and vias, performance and low power are also important goals in global routing, especially in deep submicron designs. Previous efforts that focused on power optimization for global routing are hindered by excessively long run times or the routing of a subset of the nets. Accordingly, a power efficient multi-pin global routing technique (PIRT) is proposed in this thesis. This integer linear programming based techniques strives to find a power efficient global routing solution. The results indicate that an average power savings as high as 32\% for the 130-nm technology can be achieved with no impact on the maximum chip frequency

    Comparing energy and latency of asynchronous and synchronous NoCs for embedded SoCs

    Get PDF
    Journal ArticlePower consumption of on-chip interconnects is a primary concern for many embedded system-on-chip (SoC) applications. In this paper, we compare energy and performance characteristics of asynchronous (clockless) and synchronous network on-chip implementations, optimized for a number of SoC designs. We adapted the COSI-2.0 framework with ORION 2.0 router and wire models for synchronous network generation. Our own tool, ANetGen, specifies the asynchronous network by determining the topology with simulated-annealing and router locations with force-directed placement. It uses energy and delay models from our 65 nm bundled-data router design. SystemC simulations varied traffic burstiness using the self-similar b-model. Results show that the asynchronous network provided lower median and maximum message latency, especially under bursty traffic, and used far less router energy with a slight overhead for the interrouter wires

    Timing optimization during the physical synthesis of cell-based VLSI circuits

    Get PDF
    Tese (doutorado) - Universidade Federal de Santa Catarina, Centro Tecnológico, Programa de Pós-Graduação em Engenharia de Automação e Sistemas, Florianópolis, 2016.Abstract : The evolution of CMOS technology made possible integrated circuits with billions of transistors assembled into a single silicon chip, giving rise to the jargon Very-Large-Scale Integration (VLSI). The required clock frequency affects the performance of a VLSI circuit and induces timing constraints that must be properly handled by synthesis tools. During the physical synthesis of VLSI circuits, several optimization techniques are used to iteratively reduce the number of timing violations until the target clock frequency is met. The dramatic increase of interconnect delay under technology scaling represents one of the major challenges for the timing closure of modern VLSI circuits. In this scenario, effective interconnect synthesis techniques play a major role. That is why this thesis targets two timing optimization problems for effective interconnect synthesis: Incremental Timing-Driven Placement (ITDP) and Incremental Timing-Driven Layer Assignment (ITLA). For solving the ITDP problem, this thesis proposes a new Lagrangian Relaxation formulation that minimizes timing violations for both setup and hold timing constraints. This work also proposes a netbased technique that uses Lagrange multipliers as net-weights, which are dynamically updated using an accurate timing analyzer. The netbased technique makes use of a novel discrete search to relocate cells by employing the Euclidean distance to define a proper neighborhood. For solving the ITLA problem, this thesis proposes a network flow approach that handles simultaneously critical and non-critical segments, and exploits a few flow conservation conditions to extract timing information for each net segment individually, thereby enabling the use of an external timing engine. The experimental validation using benchmark suites derived from industrial circuits demonstrates the effectiveness of the proposed techniques when compared with state-of-the-art works.A evolução da tecnologia CMOS viabilizou a fabricação de circuitos integrados contendo bilhões de transistores em uma única pastilha de silício, dando origem ao jargão Very-Large-Scale Integration (VLSI). A frequência-alvo de operação de um circuito VLSI afeta o seu desempenho e induz restrições de timing que devem ser manipuladas pelas ferramentas de síntese. Durante a síntese física de circuitos VLSI, diversas técnicas de otimização são usadas para iterativamente reduzir o número de violações de timing até que a frequência-alvo de operação seja atingida. O aumento dramático do atraso das interconexões devido à evolução tecnológica representa um dos maiores desafios para o fluxo de timing closure de circuitos VLSI contemporâneos. Nesse cenário, técnicas de síntese de interconexão eficientes têm um papel fundamental. Por este motivo, esta tese aborda dois problemas de otimização de timing para uma síntese eficiente das interconexões de um circuito VLSI: Incremental Timing-Driven Placement (ITDP) e Incremental Timing-Driven Layer Assignment (ITLA). Para resolver o problema de ITDP, esta tese propõe uma nova formulação utilizando Relaxação Lagrangeana que tem por objetivo a minimização simultânea das violações de timing para restrições do tipo setup e hold. Este trabalho também propõe uma técnica que utiliza multiplicadores de Lagrange como pesos para as interconexões, os quais são atualizados dinamicamente através dos resultados de uma ferramenta de análise de timing. Tal técnica realoca as células do circuito por meio de uma nova busca discreta que adota a distância Euclidiana como vizinhança.Para resolver o problema de ITLA, esta tese propõe uma abordagem em fluxo em redes que otimiza simultaneamente segmentos críticos e não-críticos, e explora algumas condições de fluxo para extrair as informações de timing para cada segmento individualmente, permitindo assim o uso de uma ferramenta de timing externa. A validação experimental, utilizando benchmarks derivados de circuitos industriais, demonstra a eficiência das técnicas propostas quando comparadas com trabalhos estado da arte

    3D IC optimal layout design. A parallel and distributed topological approach

    Full text link
    The task of 3D ICs layout design involves the assembly of millions of components taking into account many different requirements and constraints such as topological, wiring or manufacturability ones. It is a NP-hard problem that requires new non-deterministic and heuristic algorithms. Considering the time complexity, the commonly applied Fiduccia-Mattheyses partitioning algorithm is superior to any other local search method. Nevertheless, it can often miss to reach a quasi-optimal solution in 3D spaces. The presented approach uses an original 3D layout graph partitioning heuristics implemented with use of the extremal optimization method. The goal is to minimize the total wire-length in the chip. In order to improve the time complexity a parallel and distributed Java implementation is applied. Inside one Java Virtual Machine separate optimization algorithms are executed by independent threads. The work may also be shared among different machines by means of The Java Remote Method Invocation system.Comment: 26 pages, 9 figure

    A predictive distributed congestion metric with application to technology mapping

    Full text link
    corecore