9 research outputs found
Custom Cell Placement Automation for Asynchronous VLSI
Asynchronous very-large-scale integration (VLSI) circuits have demonstrated many advantages over their synchronous counterparts, including low power consumption, elastic pipelining, and robustness against manufacturing and temperature variations. However, the lack of dedicated electronic design automation (EDA) tools, especially physical layout automation tools, largely limits the adoption of asynchronous circuits. Existing commercial placement tools are optimized for synchronous circuits and require a standard cell library provided by semiconductor foundries to complete the physical design. The cell layouts in this library all have the same height, which simplifies the placement problem and the power distribution network. Although the standard cell methodology also works for asynchronous designs, the resulting performance is inferior to that of designs produced with the full-custom methodology. To tackle this challenge, we propose a gridded cell layout methodology for asynchronous circuits, in which the cell height and cell width can be any integer multiple of two grid values. The gridded cell approach combines the shape regularity of standard cells with the size flexibility of full-custom layouts. Therefore, this approach can achieve a better space utilization ratio and lower wire length for asynchronous designs. Experiments have shown that the gridded cell placement approach reduces area without impacting the routability. We have also used this placer to tape out a chip in a 65nm process technology, demonstrating that our placer generates design-rule clean results.
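The gridded-cell idea in the abstract above can be illustrated with a small sketch. It assumes that each dimension is rounded up to the next integer multiple of its own grid value and that all lengths are integers in a common unit. The function names `snap_to_grid` and `utilization` are hypothetical, and the rounding rule is an assumption rather than something the abstract specifies.

```python
def snap_to_grid(width, height, grid_x, grid_y):
    """Round cell dimensions up to integer multiples of the two grid values.

    All arguments are positive integers in one length unit (an assumption).
    """
    cols = -(-width // grid_x)   # ceiling division without floats
    rows = -(-height // grid_y)
    return cols * grid_x, rows * grid_y


def utilization(cells, grid_x, grid_y):
    """Ratio of raw cell area to snapped (gridded) cell area."""
    raw = sum(w * h for w, h in cells)
    snapped = 0
    for w, h in cells:
        sw, sh = snap_to_grid(w, h, grid_x, grid_y)
        snapped += sw * sh
    return raw / snapped


if __name__ == "__main__":
    # A 250 x 130 cell on a 100 x 60 grid occupies 3 columns and 3 rows.
    print(snap_to_grid(250, 130, 100, 60))
    print(utilization([(250, 130), (100, 60)], 100, 60))
```

Finer grid values waste less area per cell but give the placer less regularity to exploit. The abstract does not say how the authors balance the two.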
Nanometer VLSI placement and optimization for multi-objective design closure
In a VLSI physical synthesis flow, placement directly defines the interconnection,
which affects many other design objectives, such as timing, power consumption,
congestion, and thermal issues. With the scaling of technology, the relative interconnect
delay increases dramatically. As a result, placement has become a bottleneck
in deep sub-micron physical synthesis. In this dissertation, I propose several
optimization algorithms, ranging from global placement, placement migration, and
timing-driven placement to incremental power optimization, for multi-objective VLSI
design closure. The first work is DPlace, a new global placement algorithm that scales
well to modern large-scale circuit placement problems. DPlace simulates the
natural diffusion process to spread cells smoothly over the placement region, and
uses both analytical and discrete techniques to improve the wire length. However,
global placement alone is never sufficient for multi-objective design closure: a variety of
design objectives, such as timing, routing congestion, signal integrity, and heat
distribution, have to be improved incrementally. Placement migration is a critical step
to address the cell overlaps appearing during incremental optimizations. To achieve
high placement stability, I propose a computational geometry based placement migration
flow to cope with placement changes, and a new stability metric to measure
the “similarity” between two placements accurately. Our placement migration algorithm
has a clear advantage over conventional legalization algorithms in that the
neighborhood characteristics of the original placement are preserved. For timing
closure in high performance designs, I present a linear programming based incremental
timing driven placement to improve the timing on critical paths directly.
I further present an efficient timing driven placement algorithm (Pyramids). Two
formulations of Pyramids are proposed, which are suitable for different optimization
stages in a physical synthesis flow. Both formulations find the timing-optimal location
of a cell in constant time using computational geometry techniques.
For fast convergence of design closure, placement should be integrated
with other optimization techniques. I propose to combine placement, gate sizing
and Vt swapping techniques to reduce the total power consumption, especially the
leakage power, which is becoming increasingly critical for nanometer VLSI design
closure.
Electrical and Computer Engineering
Flow-based Partitioning and Fast Global Placement in Chip Design
VLSI placement is one of the major steps in the chip design process and an interesting subject of research in industry and academia. Recent chips consist of several millions of circuits connected by millions of nets. The classical placement objective of finding positions for circuits and minimizing netlength among them is an ongoing issue in optimization of chip performance. The increasing instance sizes and the tightness of timing and routability constraints pose a real challenge to the design flows and the designers, which often cannot be addressed properly without considering these constraints explicitly within the placement. Many of the complex design methodologies follow an iterative approach, using placement several times in this process. Thus, placement runtime has a severe impact on the turnaround time in chip development. The major contributions of this thesis deal with global placement, a common relaxation of the placement problem, which computes rough positions of the circuits minimizing the total length of wires to interconnect them. Based on the idea of subsequent quadratic netlength minimization and partitioning, as in BonnPlace [BrennerStruzynaVygen:2008], we present several new algorithms, generalized data structures and a completely new implementation of this top-down placement scheme. We introduce and formalize the concept of movebounds, which are position constraints on subsets of cells. Movebounds, which can be regarded as mandatory or soft constraints, provide a mechanism to explicitly incorporate into the placement movement constraints that result from issues of timing, power and routability. With inclusive movebounds, such restrictions can be assigned to groups of circuits without any influence on other placeable objects. The other constraints, namely the exclusive movebounds, are of particular interest for semi-hierarchical approaches, as they can be used to obtain a flat view of the design and prevent cells from being placed into hierarchy units.
Both provide a toolbox to the designer and allow the control of particular circuit sets without netlist manipulations. We also present a top-down partitioning scheme and extend the legalization algorithm of [BrennerVygen:2004] to be able to deal with millions of cells and dozens of movebounds efficiently. The presented algorithm can handle different types of overlapping movebounds, even in legalization, and produces significantly better results than a modern industrial tool. We present a novel partitioning algorithm for global placement. Unlike previous iterative and recursive approaches, the new method provides a global view of the problem using a novel MinCostFlow model with extremely fast and highly parallelizable local realization steps. The new flow-based partitioning can address density targets much more accurately and lowers the risk of density violations. The presented MinCostFlow model does not depend on the number of cells, making it highly interesting for large and huge designs. Moreover, the embedded flow structure responds to the chip's floorplan much better than the classical global partitioning approach. Another significant advantage of this algorithm is the fact that it can be applied to any initial placement and guarantees a feasible (fractional) solution (if one exists), improving the tool's reliability, even with movebounds and starting from placements with significant density violations. Using this method we can extend the congestion-driven placement to a combined movement, density adjustment, and cell size inflation approach. This method is able to handle movebounds and guarantees to resolve density overloads properly. Flow-based partitioning creates the opportunity of applying local, density unaware, optimization steps within global placement and allows it to break the strict recursive structure of levels and save runtime. The extended flexibility and runtime improvement are not the only advantages. 
The proposed flow realization, which is a combination of local quadratic programs and local partitioning, not only yields a runtime improvement but also seems to merge connectivity information into partitioning much better than the old recursive partitioning approach. The new flow-based partitioning also helps to significantly improve the results of our placement in terms of netlength. We provide fast data structures for hierarchically clustered netlists and extend the net models Clique and Star so that they can be applied efficiently within the clustered netlists. We show how shared-memory parallelization can be used for speeding up various routines in placement without the loss of repeatability. In addition, we address the clustering problem, that is, finding circuit groups which should be placed in the vicinity of each other. In order to provide global information for a fast bottom-up clustering, we propose to incorporate connectivity information using random walks. To this end, we show how the hitting times can be efficiently retrieved from large netlist hypergraphs. Due to the proposed model, parallel computation on sparse, shared-memory matrices can be used for computing hitting times to several targets simultaneously. Combined with a bottom-up clustering, even our preliminary approach significantly outperforms the popular BestChoice algorithm [Nam et al. 2005]. We conclude this thesis by providing several experimental results on a large testbed of real-world chips and benchmarks demonstrating the performance of our tool. Without movebounds, our tool performs as well as a state-of-the-art force-directed placer, but is more than 5x faster. We achieve the same speedup over the old BonnPlace, but produce significantly better results, by more than 8% on average. With movebounds, our placements are more than 30% shorter compared to the force-directed placer, and our tool is 9x-20x faster.
Our tool also produces the best results on the latest ISPD 2006 placement benchmarks.
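The random-walk hitting times mentioned in the abstract above can be illustrated on a tiny example. The sketch assumes a plain undirected, connected graph given as adjacency lists and solves h(target) = 0, h(v) = 1 + mean of h over the neighbors of v by simple in-place sweeps. The thesis works on netlist hypergraphs with parallel sparse-matrix computation, so this is a toy stand-in, and `hitting_times` is a hypothetical name.

```python
def hitting_times(adj, target, sweeps=1000):
    """Expected number of random-walk steps from each vertex to `target`.

    adj maps each vertex to a non-empty list of neighbors (connected graph
    assumed). Uses Gauss-Seidel style in-place sweeps of the equations
    h(target) = 0 and h(v) = 1 + average of h over the neighbors of v.
    """
    h = {v: 0.0 for v in adj}
    for _ in range(sweeps):
        for v, nbrs in adj.items():
            if v == target:
                continue
            h[v] = 1.0 + sum(h[u] for u in nbrs) / len(nbrs)
    return h


if __name__ == "__main__":
    # Path graph 0 - 1 - 2 with vertex 2 as the target.
    path = {0: [1], 1: [0, 2], 2: [1]}
    print(hitting_times(path, target=2))
```

Vertices with small mutual hitting times are natural candidates for the same cluster, which is the intuition behind using such values to guide a bottom-up clustering.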
Aceleração da legalização incremental mediante o uso de árvores espaciais (Accelerating incremental legalization using spatial trees)
Dissertation (master's) - Universidade Federal de Santa Catarina, Centro Tecnológico, Programa de Pós-Graduação em Ciência da Computação, Florianópolis, 2017.
Abstract: In the physical synthesis of digital circuits, circuit legalization removes overlaps and keeps cells aligned with circuit rows and sites while minimizing total cell displacement. Legalization is applied not only after global placement, but also after incremental optimization steps such as incremental timing-driven placement, gate sizing, and buffer insertion. In incremental optimization techniques, the legalization step can be applied as a final step, after each optimization iteration, or incrementally, after each cell movement. Unfortunately, recent incremental legalization techniques employ data structures that are not suitable for handling geometry information. In addition, although different works on incremental optimization use different legalization strategies, those works do not present quantitative results on how those strategies affect the runtime and the quality of the final solution. This work proposes a new incremental legalization technique that relies on an R-tree, a data structure tailored to efficient storage of geometry information, which allows for fast spatial search. The proposed technique was compared to state-of-the-art incremental legalization techniques, as well as to the final and iterative legalization strategies. Experimental results show that the proposed technique is at least 6 times faster than the related work on incremental legalization while performing as many successful legalizations. In addition, it is faster than the other two legalization strategies, while resulting in a solution with a similar density profile and circuit wirelength.
Incremental Timing-Driven Placement with Displacement Constraint
In the modern deep-submicron very-large-scale integration (VLSI) design flow, interconnect
delays are becoming a major limiting factor for timing closure. Traditional placement
algorithms such as routability-driven placement (which improves routability) and
wirelength-driven placement (which reduces total wirelength) are no longer sufficient to
close timing. To this end, timing-driven placement plays a crucial role in reducing the
interconnect delay through timing-critical paths (paths with timing violations/negative
slacks) of the design and thereby achieving a specific performance/clock frequency.
In the placement flow, timing information about the design can be incorporated during
global placement and/or incremental/detailed placement. Although there have been
significant advances in the quality of global placement over the years, there is a growing
need for high-performance incremental timing-driven placement due to the lack of accurate
interconnect information during global placement. Moreover, incremental timing-driven
placement is essential to recover timing while preserving the other optimization objectives,
such as total wirelength and routing congestion, which are optimized at the early
stages of the design flow.
This thesis proposes a simple yet efficient incremental timing-driven placement algorithm
that seeks to find optimized locations for standard cells so that the total negative
slack of the design can be maximized. Our algorithm consists of two stages: (1) Global Move,
which positions standard cells inside a critical bounding box to eliminate timing violations
on timing-critical nets; and (2) Local Move, which provides further timing improvement by
finely adjusting the current locations of the standard cells within a local region.
We evaluate our algorithm using the ICCAD-2014 timing-driven placement contest benchmarks.
The results show that, on average, our technique eliminates 94% and 30% of the
late and early total negative slacks, respectively, and 82% and 27% of the late and early
worst negative slacks, respectively, under short and long displacement constraints. The
1st-place team of the contest improves late and early total negative slacks by 90% and
39%, respectively, and improves late and early worst negative slack by 76% and 32%,
respectively. Taking into account both timing violation improvement and the placement
quality (i.e., other objectives), on average, we outperform the 1st-place team by 3% in
terms of the ICCAD-2014 contest quality score, and our technique is 4.6× faster in terms
of runtime.
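The slack metrics quoted in the abstract above can be made concrete with a short sketch. It uses the usual textbook definitions: total negative slack (TNS) is the sum of all negative endpoint slacks, and worst negative slack (WNS) is the most negative slack, or zero if none is negative. The exact contest scoring may differ. The function names and the sample slack values are illustrative assumptions, not data from the thesis.

```python
def total_negative_slack(slacks):
    """Sum of the negative endpoint slacks (zero if none are negative)."""
    return sum(s for s in slacks if s < 0)


def worst_negative_slack(slacks):
    """Most negative endpoint slack, clamped to zero when timing is met."""
    return min(min(slacks, default=0.0), 0.0)


def improvement_percent(before, after):
    """Percentage of a negative-slack metric eliminated by an optimization."""
    if before == 0:
        return 0.0
    return 100.0 * (before - after) / before


if __name__ == "__main__":
    # Hypothetical endpoint slacks before and after placement.
    before = [-3.0, 2.0, -1.0, 0.5]
    after = [-1.0, 2.0, 0.0, 0.5]
    tns_b, tns_a = total_negative_slack(before), total_negative_slack(after)
    print(tns_b, tns_a, improvement_percent(tns_b, tns_a))
```

Because TNS is never positive, "maximizing" it means driving it toward zero, and an improvement of 100% corresponds to removing every violation.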
Diffusion-based placement migration
Placement migration is the movement of cells within an existing placement to address a variety of post-placement design closure issues, such as timing, routing congestion, signal integrity, and heat distribution. To fix a design problem, one would like to perturb the design as little as possible while preserving the integrity of the original placement. This work presents a new diffusion-based placement method based on a discrete approximation to a closed-form solution of the continuous diffusion equation. It has the advantage of smooth spreading, which helps preserve neighborhood characteristics of the original placement. Applying this technique to placement legalization demonstrates significant improvements in wire length and timing compared to other commonly used techniques.
Diffusion-Based Placement Migration
Placement migration is the movement of cells in an existing placement to address a variety of post-placement design closure issues. This work presents a new diffusion-based method that has the advantage of smooth spreading, which preserves the integrity of the original placement. Furthermore, the mathematics of the diffusion process is a well-studied field, with a wide body of formal methods and techniques available to draw upon. Our algorithm takes advantage of a discrete approximation to a closed-form solution of the continuous diffusion problem formulation. This approach can address the problem of post-placement optimization for objectives such as timing, routing congestion, signal integrity, and heat distribution. Our algorithm is also potentially useful as a generic spreading technique to be used in conjunction with analytic or force-directed placement algorithms. In this paper we use the diffusion algorithm to address the problem of placement legalization. Our experimental results show significant improvements in wire length and timing as compared to other commonly used techniques.
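To give a feel for the smooth spreading described above, here is a minimal sketch of diffusion on a one-dimensional row of density bins. It uses a generic explicit finite-difference step with reflecting boundaries. This is an assumption chosen for brevity and not the authors' discrete approximation, which the abstract does not spell out. The names `diffuse_step` and `dt` are hypothetical.

```python
def diffuse_step(density, dt=0.25):
    """One explicit diffusion step on a 1-D list of bin densities.

    Reflecting boundaries (ghost bins copy the edge bins) keep the total
    density constant. dt <= 0.5 is required for stability of this scheme.
    """
    n = len(density)
    new = []
    for i in range(n):
        left = density[i - 1] if i > 0 else density[0]
        right = density[i + 1] if i < n - 1 else density[n - 1]
        new.append(density[i] + dt * (left + right - 2.0 * density[i]))
    return new


if __name__ == "__main__":
    # All cell area starts in the first bin and spreads toward empty bins.
    d = [4.0, 0.0, 0.0, 0.0]
    for step in range(5):
        print(step, d)
        d = diffuse_step(d)
```

In a placer, cells would then be moved along the resulting density gradient, from overfull bins toward emptier ones. Because each step only exchanges density between neighboring bins, the relative order of nearby cells tends to survive.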
Incremental placement for modern VLSI design closure
The nature of multiple objectives and the incremental design process for modern VLSI
design closure demands advanced incremental placement techniques. In this dissertation,
I propose several novel incremental placement methods for design closure
objectives such as timing, signal integrity, legalization, and total wirelength (TWL).
These methods can be applied to any physical synthesis system. The first technique is
sensitivity based netweighting. The objective is to improve both worst negative slack
(WNS) and figure of merit (FOM), defined as the total slack difference compared to a
certain slack threshold for all timing end points. It performs incremental global placements
with netweights based on comprehensive analysis of the wirelength, slack and
FOM sensitivities to the netweight. The experiments show promising results for both
stand-alone timing driven placement and physical synthesis afterwards. The second
technique is noise map driven two-step incremental placement. The novel noise map
is used to estimate the placement impact on coupling noise, taking into account
coupling capacitance, driver resistance and wire resistance. Guided by this accurate
noise map, it performs a two-step incremental placement, i.e. cell inflation and
local refinement, to expand regions with high noise impact in order to reduce total
noise. Experimental results show significant timing and noise improvement with no
wirelength penalty or CPU overhead. The third, yet most promising, technique is
diffusion based placement migration, which is the smooth movement of cells in an
existing placement to address a variety of post placement design closure issues. This
method simulates a diffusion process where cells move from high concentration area
to low concentration area. Its application to placement legalization shows significant
improvements in wirelength and timing as compared to other commonly used
legalization techniques. The fourth technique is the first-do-no-harm detailed placement.
It uses a set of pin-based timing and electrical constraints to prevent detailed
placement techniques from degrading timing or violating electrical constraints while
reducing wirelength. The experimental results show that this detailed placement
technique not only reduces TWL, but also significantly improves timing.
Electrical and Computer Engineering