62 research outputs found

    High performance algorithms for large scale placement problem

    Placement is one of the most important problems in electronic design automation (EDA). An inferior placement solution will not only affect the chip's performance but may also make it nonmanufacturable by producing excessive wirelength that exceeds the available routing resources. Although placement has been investigated extensively for several decades, it remains a very challenging problem, mainly because design scale has grown by orders of magnitude and the trend shows no sign of stopping. Modern chips commonly integrate millions of gates and require more than ten metal routing layers. In addition, new manufacturing techniques introduce new requirements, so multiple objectives must be optimized simultaneously during placement. Our research provides high performance algorithms for the placement problem. We propose (i) a high performance global placement core engine, POLAR; (ii) an efficient routability-driven placer, POLAR 2.0, which extends POLAR to deal with routing congestion; (iii) an ultrafast global placer, POLAR 3.0, which exploits parallelism in POLAR and can make full use of multi-core systems; and (iv) several efficient triple patterning lithography (TPL) aware detailed placement algorithms.
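
    As a point of reference for the wirelength objective mentioned above, the sketch below computes the half-perimeter wirelength (HPWL) that analytical global placers such as POLAR typically minimize; the data layout (dicts of cell coordinates and net pin lists) and the function names are illustrative assumptions, not POLAR's actual interfaces.

```python
# Minimal sketch of the half-perimeter wirelength (HPWL) objective minimized
# by analytical global placers; data layout here is illustrative only.
def hpwl(net_pins, cell_xy):
    """Sum of bounding-box half-perimeters over all nets."""
    total = 0.0
    for pins in net_pins.values():          # pins: list of cell names on the net
        xs = [cell_xy[c][0] for c in pins]
        ys = [cell_xy[c][1] for c in pins]
        total += (max(xs) - min(xs)) + (max(ys) - min(ys))
    return total

# Toy usage: three cells, two nets.
cells = {"a": (0.0, 0.0), "b": (4.0, 1.0), "c": (2.0, 3.0)}
nets = {"n1": ["a", "b"], "n2": ["a", "b", "c"]}
print(hpwl(nets, cells))  # 5.0 + 7.0 = 12.0
```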

    MLCAD: A Survey of Research in Machine Learning for CAD (Keynote Paper)

    CHIP: Clustering hotspots in layout using integer programming

    Clustering algorithms have been explored in recent years to solve the hotspot clustering problem in integrated circuit design. With applications throughout the Design for Manufacturability flow, such as hotspot library generation, systematic yield optimization, and design space exploration, generating good quality clusters along with their representative clips is of utmost importance. With several generic clustering algorithms at our disposal, hotspots can be clustered based on a defined distance metric while satisfying some tolerance conditions. However, the clusters generated by generic clustering algorithms need not be optimal. In this paper, we introduce two optimal integer linear programming formulations based on the triangle inequality to minimize the cluster count while satisfying the given constraints. Apart from minimizing the cluster count, we generate representative clips that best represent the clusters formed. For both formulations, we achieve a better cluster count in most test cases compared with the results published in the literature on the ICCAD 2016 contest benchmarks, as well as the reference results reported on the ICCAD 2016 contest website.
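
    As a rough illustration of how cluster-count minimization can be cast as an integer linear program, the sketch below opens representative clips and assigns each hotspot to one representative within a distance tolerance. It is a generic facility-location-style formulation written with the PuLP package, not the paper's triangle-inequality formulations; all names and the data layout are assumptions.

```python
# Generic ILP sketch: minimize the number of representative clips such that
# every hotspot is assigned to a representative within the distance tolerance.
import pulp

def cluster_hotspots(dist, tol):
    """dist: dict[(i, j)] -> pairwise distance between hotspot ids; tol: max radius."""
    ids = sorted({i for i, _ in dist} | {j for _, j in dist})

    def close(i, j):
        return i == j or dist.get((i, j), dist.get((j, i), float("inf"))) <= tol

    prob = pulp.LpProblem("hotspot_clustering", pulp.LpMinimize)
    y = {j: pulp.LpVariable(f"rep_{j}", cat="Binary") for j in ids}
    x = {(i, j): pulp.LpVariable(f"asg_{i}_{j}", cat="Binary")
         for i in ids for j in ids if close(i, j)}

    prob += pulp.lpSum(y.values())                       # objective: number of clusters
    for i in ids:                                        # every hotspot assigned exactly once
        prob += pulp.lpSum(x[i, j] for j in ids if (i, j) in x) == 1
    for (i, j), v in x.items():                          # assign only to opened representatives
        prob += v <= y[j]

    prob.solve(pulp.PULP_CBC_CMD(msg=False))
    return {i: next(j for j in ids if (i, j) in x and x[i, j].value() > 0.5) for i in ids}

# Toy usage: hotspots a and b are close; c is far from both -> two clusters.
print(cluster_hotspots({("a", "b"): 1.0, ("a", "c"): 9.0, ("b", "c"): 8.5}, tol=2.0))
```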

    On the marriage of strings

    Given two sets L and R of strings such that R is the result of applying unknown transformations to L, match every string in L to its corresponding transformed string in R. This problem was proposed at the 2018 International Conference on Computer-Aided Design (ICCAD) CAD Contest, and this work studies a deterministic solution to it using graph theory and approximate string matching. From graph theory, bipartite matching and stable marriage are reviewed; from approximate string matching, the suffix array, edit distance, q-gram distance, and q-gram index are considered. Three other algorithms are proposed based on these concepts, two of which serve only as baselines for the others. The proposed solution itself is divided into three steps: the construction of filters (or indexes) on L and R with a suffix array or q-gram index; the calculation of the adjacency (preference) lists for the graph G = (L ∪ R, E) with edit or q-gram distance as edge weights; and, finally, the computation of the bipartite matching. A framework for benchmarking the run time and accuracy of this approach was built; it also allows easy switching between the available algorithms. The most significant outcome of the benchmarks is that the filtering step is the bottleneck of the proposed methodology: it affects accuracy and prevents proper reasoning about the impact of some of the compared algorithms. Nevertheless, the winning teams of the contest managed to achieve 100% accuracy. Once their solutions are published, it will be possible to study them, better understand the compromises of our choices, and propose improvements to our approach.
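
    The following sketch shows one way the pieces named above could fit together: q-gram distance as the edge weight between every pair in L × R, then a minimum-cost bipartite matching (here via SciPy's assignment solver rather than stable marriage). The filtering step with suffix arrays or q-gram indexes, which the text identifies as the bottleneck, is omitted; all names are illustrative.

```python
# Illustrative pipeline: q-gram distance as edge weight, assignment as matching.
from collections import Counter
import numpy as np
from scipy.optimize import linear_sum_assignment

def qgrams(s, q=3):
    return Counter(s[i:i + q] for i in range(max(len(s) - q + 1, 1)))

def qgram_distance(a, b, q=3):
    ga, gb = qgrams(a, q), qgrams(b, q)
    return sum(abs(ga[g] - gb[g]) for g in set(ga) | set(gb))

def match_strings(L, R, q=3):
    cost = np.array([[qgram_distance(l, r, q) for r in R] for l in L])
    rows, cols = linear_sum_assignment(cost)        # min-cost perfect matching
    return {L[i]: R[j] for i, j in zip(rows, cols)}

# Toy usage: each string in L is matched to its transformed counterpart in R.
print(match_strings(["clk_gate", "add_32"], ["add_32_syn", "clk_gate_buf"]))
```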

    The False Dawn: Reevaluating Google's Reinforcement Learning for Chip Macro Placement

    Reinforcement learning (RL) for the physical design of silicon chips, presented in a Google 2021 Nature paper, stirred controversy due to poorly documented claims that raised eyebrows and attracted critical media coverage. The Nature paper withheld most of the inputs needed to reproduce the reported results, as well as some critical steps in the methodology. However, two separate evaluations filled in the gaps and demonstrated that Google RL lags behind human designers, behind a well-known algorithm (simulated annealing), and behind generally available commercial software. Cross-checked data indicate that the integrity of the Nature paper is substantially undermined owing to errors in its conduct, analysis, and reporting. Comment: 14 pages, 1 figure, 3 tables.

    Incremental Timing-Driven Placement with Displacement Constraint

    In the modern deep-submicron Very Large Scale Integration (VLSI) design flow, interconnect delays are becoming a major limiting factor for timing closure. Traditional placement algorithms, such as routability-driven placement (which improves routability) and wirelength-driven placement (which reduces total wirelength), are no longer sufficient to close timing. To this end, timing-driven placement plays a crucial role in reducing the interconnect delay through timing-critical paths (paths with timing violations, i.e., negative slacks) of the design, thereby achieving a specific performance/clock frequency. In the placement flow, timing information about the design can be incorporated during global placement and/or incremental/detailed placement. Although the quality of global placement has advanced significantly over the years, there is a growing need for high performance incremental timing-driven placement because accurate interconnect information is not available during global placement. Moreover, incremental timing-driven placement is essential to recover timing while preserving the other optimization objectives, such as total wirelength and routing congestion, that were optimized at earlier stages of the design flow. This thesis proposes a simple yet efficient incremental timing-driven placement algorithm that seeks optimized locations for standard cells so that the total negative slack of the design is maximized. Our algorithm consists of two stages: (1) Global Move, which positions standard cells inside a critical bounding box to eliminate timing violations on timing-critical nets; and (2) Local Move, which provides further timing improvement by finely adjusting the current locations of the standard cells within a local region. We evaluate our algorithm using the ICCAD-2014 timing-driven placement contest benchmarks. The results show that, on average, our technique eliminates 94% and 30% of the late and early total negative slacks, respectively, and 82% and 27% of the late and early worst negative slacks, respectively, under the short and long displacement constraints. The 1st-place team of the contest improves late and early total negative slacks by 90% and 39%, respectively, and improves late and early worst negative slacks by 76% and 32%, respectively. Taking into account both timing violation improvement and placement quality (i.e., the other objectives), on average, we outperform the 1st-place team by 3% in terms of the ICCAD-2014 contest quality score, and our technique is 4.6× faster in terms of runtime.
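
    The sketch below illustrates only the geometric core of what a Global Move as described above might do: pull a cell toward the bounding box of the other pins on a timing-critical net while honoring a maximum displacement. The slack model, legalization, and data structures of the thesis are not reproduced; function and parameter names are assumptions.

```python
# Hedged sketch: move a cell toward the critical bounding box of its net's
# other pins, clipped by a per-axis displacement constraint.
def clamp(v, lo, hi):
    return max(lo, min(hi, v))

def global_move(cell_xy, other_pin_xys, max_disp):
    """Return a new (x, y) inside the critical bounding box, limited by max_disp."""
    xs = [p[0] for p in other_pin_xys]
    ys = [p[1] for p in other_pin_xys]
    # Target: nearest point of the critical bounding box to the cell.
    tx = clamp(cell_xy[0], min(xs), max(xs))
    ty = clamp(cell_xy[1], min(ys), max(ys))
    # Enforce the displacement constraint, one axis at a time.
    nx = clamp(tx, cell_xy[0] - max_disp, cell_xy[0] + max_disp)
    ny = clamp(ty, cell_xy[1] - max_disp, cell_xy[1] + max_disp)
    return nx, ny

print(global_move((0, 0), [(10, 2), (14, 6)], max_disp=5))  # -> (5, 2)
```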

    Timing optimization during the physical synthesis of cell-based VLSI circuits

    Doctoral thesis - Universidade Federal de Santa Catarina, Centro Tecnológico, Programa de Pós-Graduação em Engenharia de Automação e Sistemas, Florianópolis, 2016. Abstract: The evolution of CMOS technology made it possible to assemble integrated circuits with billions of transistors into a single silicon chip, giving rise to the jargon Very-Large-Scale Integration (VLSI). The required clock frequency affects the performance of a VLSI circuit and induces timing constraints that must be properly handled by synthesis tools. During the physical synthesis of VLSI circuits, several optimization techniques are used to iteratively reduce the number of timing violations until the target clock frequency is met. The dramatic increase in interconnect delay under technology scaling represents one of the major challenges for the timing closure of modern VLSI circuits. In this scenario, effective interconnect synthesis techniques play a major role. That is why this thesis targets two timing optimization problems for effective interconnect synthesis: Incremental Timing-Driven Placement (ITDP) and Incremental Timing-Driven Layer Assignment (ITLA). For the ITDP problem, this thesis proposes a new Lagrangian Relaxation formulation that minimizes timing violations for both setup and hold timing constraints. This work also proposes a net-based technique that uses Lagrange multipliers as net weights, which are dynamically updated using an accurate timing analyzer. The net-based technique relocates cells with a novel discrete search that employs the Euclidean distance to define a proper neighborhood. For the ITLA problem, this thesis proposes a network flow approach that handles critical and non-critical segments simultaneously and exploits a few flow conservation conditions to extract timing information for each net segment individually, thereby enabling the use of an external timing engine. The experimental validation, using benchmark suites derived from industrial circuits, demonstrates the effectiveness of the proposed techniques when compared with state-of-the-art works.
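
    As a small illustration of the Lagrangian Relaxation idea above, the sketch below applies a generic projected subgradient update in which each net's Lagrange multiplier (used as a net weight) grows when the net has negative slack and shrinks otherwise. The thesis's actual update rule, timing model, and data structures are not reproduced; all names here are assumptions.

```python
# Generic projected subgradient step for Lagrange multipliers used as net weights,
# driven by slacks reported by a timing analyzer.
def update_net_weights(weights, slacks, step=0.1):
    """weights, slacks: dict net -> float; violated (negative-slack) nets gain weight."""
    return {net: max(0.0, w + step * (-slacks.get(net, 0.0)))  # project onto lambda >= 0
            for net, w in weights.items()}

w0 = {"n1": 1.0, "n2": 1.0}
slk = {"n1": -0.8, "n2": 0.3}            # n1 violates its timing constraint
print(update_net_weights(w0, slk))       # {'n1': 1.08, 'n2': 0.97}
```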

    RTL-aware dataflow-driven macro placement

    When RTL designers define the hierarchy of a system, they exploit their knowledge of the conceptual abstractions devised during the design and of the functional interactions between the logical components. This valuable information is often lost during physical synthesis. This paper proposes a novel multi-level approach to the macro placement problem for complex designs dominated by macro blocks, typically memories. By taking advantage of the hierarchy tree, the netlist is divided into blocks containing macros and standard cells, and their dataflow affinity is inferred from the latency and flow width of their interactions. The layout is represented using slicing structures and generated with a top-down algorithm capable of handling blocks with both hard and soft components, aimed at wirelength minimization. These techniques have been applied to a set of large industrial circuits and compared against both a commercial floorplanner and floorplans handcrafted by expert back-end engineers. The proposed approach outperforms the commercial tool and produces solutions of similar quality to the best handcrafted floorplans. The generated floorplans therefore provide an excellent starting point for the physical design iterations and contribute to significantly reducing turn-around time.
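
    To make the slicing-structure representation mentioned above concrete, the sketch below models a floorplan as a binary tree of horizontal and vertical cuts and composes block dimensions bottom-up. It shows only shape composition; the paper's top-down placement algorithm and dataflow-affinity metric are not reproduced, and all names are illustrative.

```python
# Slicing tree: internal nodes are 'H'/'V' cuts, leaves are blocks.
from dataclasses import dataclass
from typing import Optional

@dataclass
class SliceNode:
    cut: Optional[str] = None          # 'H', 'V', or None for a leaf
    left: Optional["SliceNode"] = None
    right: Optional["SliceNode"] = None
    w: float = 0.0                     # leaf width
    h: float = 0.0                     # leaf height

    def shape(self):
        if self.cut is None:
            return self.w, self.h
        lw, lh = self.left.shape()
        rw, rh = self.right.shape()
        if self.cut == "V":            # children placed side by side
            return lw + rw, max(lh, rh)
        return max(lw, rw), lh + rh    # 'H': children stacked vertically

# Two macro blocks side by side, with a standard-cell region stacked above them.
plan = SliceNode("H",
                 SliceNode("V", SliceNode(w=40, h=30), SliceNode(w=25, h=30)),
                 SliceNode(w=65, h=20))
print(plan.shape())  # (65, 50)
```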

    Advancing Hardware Security Using Polymorphic and Stochastic Spin-Hall Effect Devices

    Protecting intellectual property (IP) in electronic circuits has become a serious challenge in recent years. Logic locking/encryption and layout camouflaging are two prominent techniques for IP protection. Most existing approaches, however, particularly those focused on CMOS integration, incur excessive design overheads resulting from their need for additional circuit structures or device-level modifications. This work leverages the innate polymorphism of an emerging spin-based device, called the giant spin-Hall effect (GSHE) switch, to simultaneously enable locking and camouflaging within a single instance. Using the GSHE switch, we propose a powerful primitive that enables cloaking all 16 Boolean functions possible for two inputs. We conduct a comprehensive study using state-of-the-art Boolean satisfiability (SAT) attacks to demonstrate the superior resilience of the proposed primitive in comparison to several others in the literature. While we tailor the primitive for deterministic computation, it can readily support stochastic computation; we argue that stochastic behavior can break most, if not all, existing SAT attacks. Finally, we discuss the resilience of the primitive against various side-channel attacks as well as invasive monitoring at runtime, which are arguably even more concerning threats than SAT attacks. Comment: Published in Proc. Design, Automation and Test in Europe (DATE) 2018.
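
    For context on the "16 Boolean functions possible for two inputs" claim, the sketch below simply enumerates the 2^(2^2) = 16 two-input truth tables by index; it models no GSHE device behavior and is purely illustrative.

```python
# Enumerate all 16 Boolean functions of two inputs by truth-table index.
def two_input_function(table_index):
    """Return f(a, b) whose truth table is the 4-bit encoding of table_index."""
    def f(a, b):
        row = (a << 1) | b                 # rows: 00, 01, 10, 11
        return (table_index >> row) & 1
    return f

for idx in range(16):
    table = [two_input_function(idx)(a, b) for a in (0, 1) for b in (0, 1)]
    print(f"f{idx:02d}: {table}")   # idx 6 is XOR, idx 8 is AND, idx 14 is OR, ...
```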