On-Chip Transparent Wire Pipelining (invited paper)

Author: Casu Mario Roberto
Macchiarulo Luca
Publication venue: IEEE Computer Society
Publication date: 01/01/2004

Wire pipelining has been proposed as a viable means to overcome the discrepancy between decreasing gate delays and increasing wire delays in deep-submicron technologies. Far from being a straightforwardly applicable technique, this methodology requires a number of design modifications in order to insert it seamlessly into the current design flow. In this paper we briefly survey the methods presented by other researchers in the field and then we thoroughly analyze the solutions we recently proposed, ranging from system-level wire pipelining to physical design aspects.

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

Algorithms for Circuit Sizing in VLSI Design

Author: Schorr Ulrike Elisabeth
Publication venue: Universitäts- und Landesbibliothek Bonn

One of the key problems in the physical design of computer chips, also known as integrated circuits, consists of choosing a physical layout for the logic gates and memory circuits (registers) on the chip. The layouts have a high influence on the power consumption and area of the chip and the delay of signal paths. A discrete set of predefined layouts for each logic function and register type with different physical properties is given by a library. One of the most influential characteristics of a circuit defined by the layout is its size. In this thesis we present new algorithms for the problem of choosing sizes for the circuits and its continuous relaxation, and evaluate these in theory and practice. A popular approach is based on Lagrangian relaxation and projected subgradient methods. We show that seemingly heuristic modifications that have been proposed for this approach can be theoretically justified by applying the well-known multiplicative weights algorithm. Subsequently, we propose a new model for the sizing problem as a min-max resource sharing problem. In our context, power consumption and signal delays are represented by resources that are distributed to customers. Under certain assumptions we obtain a polynomial time approximation for the continuous relaxation of the sizing problem that improves over the Lagrangian relaxation based approach. The new resource sharing algorithm has been implemented as part of the BonnTools software package which is developed at the Research Institute for Discrete Mathematics at the University of Bonn in cooperation with IBM. Our experiments on the ISPD 2013 benchmarks and state-of-the-art microprocessor designs provided by IBM illustrate that the new algorithm exhibits more stable convergence behavior compared to a Lagrangian relaxation based algorithm. Additionally, better timing and reduced power consumption were achieved on almost all instances.
A subproblem of the new algorithm consists of finding sizes minimizing a weighted sum of power consumption and signal delays. We describe a method that approximates the continuous relaxation of this problem in polynomial time under certain assumptions. For the discrete problem we provide a fully polynomial approximation scheme under certain assumptions on the topology of the chip. Finally, we present a new algorithm for timing-driven optimization of registers. Their sizes and locations on a chip are usually determined during the clock network design phase, and remain mostly unchanged afterwards although the timing criticalities on which they were based can change. Our algorithm permutes register positions and sizes within so-called clusters without impairing the clock network such that it can be applied late in a design flow. Under mild assumptions, our algorithm finds an optimal solution which maximizes the worst cluster slack. It is implemented as part of the BonnTools and improves timing of registers on state-of-the-art microprocessor designs by up to 7.8% of design cycle time.
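The subproblem named above (choosing discrete sizes that minimize a weighted sum of power and delay) can be illustrated on the simplest possible topology, a single chain of gates. The following is only a toy sketch and not the thesis's approximation scheme: the cost model (power proportional to size, stage delay proportional to driven size divided by own size), the function names, and the example library are all assumptions made for illustration. On a chain, exact dynamic programming over the size of each gate is enough.

```python
from itertools import product


def chain_cost(sizes, load, power_weight, delay_weight):
    """Weighted sum of power and delay for a chain of gates (toy model).

    Assumed model: power of a gate is proportional to its size; the delay of a
    gate is proportional to (size of what it drives) / (its own size).
    """
    total = 0.0
    for i, s in enumerate(sizes):
        driven = sizes[i + 1] if i + 1 < len(sizes) else load
        total += power_weight * s + delay_weight * driven / s
    return total


def size_chain(num_gates, library, load, power_weight, delay_weight):
    """Exact dynamic program over a chain, processed from the output backwards.

    best[s] holds (cost, sizes) of the cheapest suffix whose first gate has size s.
    """
    best = {s: (power_weight * s + delay_weight * load / s, (s,)) for s in library}
    for _ in range(num_gates - 1):
        new_best = {}
        for s in library:
            options = (
                (power_weight * s + delay_weight * nxt / s + cost, (s,) + tail)
                for nxt, (cost, tail) in best.items()
            )
            new_best[s] = min(options)
        best = new_best
    return min(best.values())


def brute_force_cost(num_gates, library, load, power_weight, delay_weight):
    """Reference answer by enumerating every size assignment (small cases only)."""
    return min(
        chain_cost(choice, load, power_weight, delay_weight)
        for choice in product(library, repeat=num_gates)
    )
```

The dynamic program runs in time proportional to the number of gates times the square of the library size, whereas the brute-force check grows exponentially; general circuit topologies need the more elaborate methods the abstract refers to.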

bonndoc – Der Publikationsserver der Universität Bonn

Concurrent optimization strategies for high-performance VLSI circuits

Author: Jiang Yanbin
Publication venue: Iowa State University Digital Repository
Publication date: 01/01/1998

In the next generation of VLSI circuits, concurrent optimizations will be essential to meet the performance challenges. In this dissertation, we present techniques for combining traditional timing optimization techniques to achieve superior performance.

The method of buffer insertion is used in timing optimization to either increase the driving power of a path in a circuit, or to isolate large capacitive loads that lie on noncritical or less critical paths. The procedure of transistor sizing selects the sizes of transistors within a circuit to achieve a given timing specification. Traditional design techniques perform these two optimizations as independent steps during synthesis, even though they are intimately linked and performing them in alternating steps is liable to lead to suboptimal solutions. The first part of this thesis presents a new approach for unifying transistor sizing with buffer insertion. Our algorithm achieves from 5% to 49% area reduction compared with the results of a standard transistor sizing algorithm.

The next part of the thesis deals with the problem of collapsing gates for technology mapping. Two new techniques are proposed. The first method, the odd-level transistor replacement (OTR) method, performs technology mapping without the restriction of a fixed library size, and maps a circuit to a virtual library of complex static CMOS gates. The second technique, the Static CMOS/PTL method, uses a mix of static CMOS and pass transistor logic (PTL) to realize the circuit, using the relation between PTL and binary decision diagrams. The methods are very efficient and can handle all ISCAS'85 benchmark circuits in minutes. On average, it was found that the OTR method gave 40%, and the Static/PTL gave 50% delay reductions over SIS, with substantial area savings.

Finally, we extend the technology mapping work to interleave it with placement in a single optimization. Conventional methods that perform these steps separately will not be adequate for next-generation circuits. Our approach presents an integrated solution to this problem, and shows an average of 28.19%, and a maximum of 78.42% improvement in the delay over a method that performs the two optimizations in separate steps.

Digital Repository @ Iowa State University (ISU)

Automatic synthesis of analog layout : a survey

Author: Rentmeesters Mark J.
Publication venue: eScholarship, University of California
Publication date: 02/08/1990

A review of recent research in the automatic synthesis of physical geometry for analog integrated circuits is presented. By way of introduction, the difficulties involved in analog layout as opposed to digital layout are explained. A review of the literature then follows. Emphasis is placed on the exposition of general methods for addressing problems specific to analog layout, with the details of specific systems only being given when they serve to illustrate these methods well. The conclusion discusses problems remaining and offers a prediction as to how technology will evolve to solve them. It is argued that although progress has been and will continue to be made in the automation of analog IC layout, due to fundamental differences in the nature of analog IC design as opposed to digital design, it should not be expected that the level of automation of the former will reach that of the latter any time soon.

eScholarship - University of California

Noise-Constrained Performance Optimization by Simultaneous Gate and Wire Sizing Based on Lagrangian Relaxation

Author: Hui-ru Jiang
Jing-yang Jou
Yao-wen Chang
Publication date: 01/01/1999

Noise, as well as area, delay, and power, is one of the most important concerns in the design of deep sub-micron ICs. Existing algorithms cannot handle simultaneous switching conditions of signals for noise minimization. In this paper, we model not only physical coupling capacitance, but also simultaneous switching behavior for noise optimization. Based on Lagrangian relaxation, we present an algorithm that can optimally solve the simultaneous noise, area, delay, and power optimization problem by sizing circuit components. Our algorithm, with linear memory requirement overall and linear runtime per iteration, is very effective and efficient. For example, for a circuit of 6144 wires and 3512 gates, our algorithm solves the simultaneous optimization problem using only 2.1 MB of memory and a 47-minute runtime to achieve a precision of within 1% error on a SUN UltraSPARC-I workstation.
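For readers unfamiliar with the term, the generic shape of a Lagrangian relaxation scheme can be written as follows. This is the textbook form only, not the paper's specific formulation: the symbols $f$ (objective, e.g. area), $g_j$ (constraints, e.g. delay or noise bounds), $X$ (simple bounds on component sizes), $\lambda_j$ (multipliers) and $\rho_k$ (step sizes) are assumed names introduced here.

```latex
\begin{align*}
\text{primal:}\quad & \min_{x \in X} f(x) \quad \text{s.t. } g_j(x) \le 0,\; j = 1,\dots,m \\
\text{Lagrangian:}\quad & L(x,\lambda) = f(x) + \sum_{j=1}^{m} \lambda_j\, g_j(x), \qquad \lambda_j \ge 0 \\
\text{dual:}\quad & \max_{\lambda \ge 0} \; \min_{x \in X} L(x,\lambda) \\
\text{subgradient step:}\quad & \lambda_j^{(k+1)} = \max\bigl(0,\; \lambda_j^{(k)} + \rho_k\, g_j(x^{(k)})\bigr),
\qquad x^{(k)} \in \arg\min_{x \in X} L(x,\lambda^{(k)})
\end{align*}
```

Each iteration solves the relaxed subproblem for fixed multipliers and then raises the multipliers of violated constraints, which is the per-iteration work the abstract's runtime claim refers to.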

CiteSeerX

Crossref

Interconnect delay modeling under exponential input

Author: Vembu Rajesh K
Publication venue: Iowa State University Digital Repository
Publication date: 01/01/2003

Interconnect has become the dominating factor in determining the performance of VLSI deep submicron designs. With the rapid shrinking of feature size and development in the process technologies, it has been observed that the resistance per unit length of the interconnect continues to increase, capacitance per unit length remains roughly constant, and transistor or gate delay continues to decrease. This has led to the increasing dominance of interconnect delay over logic delay, and this trend is expected to continue. With this being the main bottleneck in realizing high speed circuits, complete understanding of the interconnect delay, and thereby efficient and accurate delay calculation, has assumed a greater significance in physical design, optimization and fast verification. In this thesis, an interconnect delay model under exponential input is presented. Because of its simple closed form expression, fast computation speed, and fidelity with respect to simulation, the Elmore delay model remains popular. More accurate delay computation methods are typically CPU-intensive and/or difficult to implement. To bridge this gap between accuracy and efficiency/simplicity, a new RC delay metric for interconnects which is as efficient as the Elmore metric, but more accurate, is proposed. However, there is no interconnect delay model considering exponential input waveform existing in the literature. The proposed Exponential Delay Metric uses an exponential waveform as input and captures resistive shielding effects by modeling the downstream by a π-model. An application of the delay model to perform interconnect optimization using wire sizing is also presented. Experimental results show that the proposed delay model is significantly more accurate than the existing interconnect delay models.
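As background for the comparison drawn above, the classical Elmore delay of an RC tree can be computed in two linear passes. The sketch below shows only that baseline metric, not the thesis's Exponential Delay Metric; the tree encoding (a `parent` array with nodes numbered so that each parent precedes its children) and the function name are assumptions chosen for illustration.

```python
def elmore_delays(parent, resistance, capacitance):
    """Elmore delay from the source to every node of an RC tree.

    parent[i]      -- index of the parent of node i (-1 for the root);
                      nodes must be numbered so that parent[i] < i
    resistance[i]  -- resistance of the segment from parent[i] to i
                      (for the root: the driver resistance)
    capacitance[i] -- capacitance lumped at node i
    """
    n = len(parent)

    # Pass 1 (leaves to root): total capacitance at or below each node.
    downstream = list(capacitance)
    for i in range(n - 1, 0, -1):
        downstream[parent[i]] += downstream[i]

    # Pass 2 (root to leaves): each segment adds R_segment * C_downstream.
    delay = [0.0] * n
    for i in range(n):
        upstream = delay[parent[i]] if parent[i] >= 0 else 0.0
        delay[i] = upstream + resistance[i] * downstream[i]
    return delay
```

For a small tree with a branch at node 1, `elmore_delays([-1, 0, 1, 1], [0.0, 1.0, 2.0, 1.0], [0.0, 1.0, 3.0, 2.0])` gives `[0.0, 6.0, 12.0, 8.0]`: the shared segment sees all 6 units of downstream capacitance, while each branch sees only its own.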

Digital Repository @ Iowa State University (ISU)

Some Applications of the Weighted Combinatorial Laplacian

Author: Szegedy Christian
Publication venue: Universitäts- und Landesbibliothek Bonn

The weighted combinatorial Laplacian of a graph is a symmetric matrix which is the discrete analogue of the Laplacian operator. In this thesis, we will study a new application of this matrix to matching theory yielding a new characterization of factor-criticality in graphs and matroids. Other applications are from the area of the physical design of very large scale integrated circuits. The placement of the gates includes the minimization of a quadratic form given by a weighted Laplacian. A method based on the dual constrained subgradient method is proposed to solve the simultaneous placement and gate-sizing problem. A crucial step of this method is the projection to the flow space of an associated graph, which can be performed by minimizing a quadratic form given by the unweighted combinatorial Laplacian.

German abstract (Anwendungen der gewichteten kombinatorischen Laplace-Matrix), translated: The weighted combinatorial Laplacian matrix is the discrete analogue of the Laplace operator. In this thesis we present a novel characterization of factor-criticality of graphs and matroids by means of this matrix. We investigate other applications in the area of the design of very large scale integrated circuits. Placement is based on minimizing a quadratic form given by a weighted combinatorial Laplacian matrix. We present an algorithm for the general simultaneous placement and gate-sizing optimization problem that is based on the dual subgradient method. An important component of this method is a projection onto the flow space of an associated graph, which can be viewed as the minimization of a quadratic form given by the Laplacian matrix.
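The link between placement and the weighted Laplacian can be made concrete in one dimension. Minimizing the sum of w_uv (x_u - x_v)^2 over the movable cells, with some cells fixed, means that every movable cell sits at the weighted average of its neighbours. The sketch below solves that condition by Gauss-Seidel sweeps; it is a generic illustration of quadratic placement, not the thesis's method, and the function name, the edge-list encoding and the sweep count are assumptions.

```python
def quadratic_placement_1d(num_cells, edges, fixed, sweeps=500):
    """Minimize sum of w * (x[u] - x[v])**2 over movable cells (1-D sketch).

    edges -- list of (u, v, weight) connections with positive weights
    fixed -- dict mapping fixed cells (e.g. pads) to their coordinates
    Assumes every movable cell is connected, directly or indirectly,
    to at least one fixed cell.
    """
    neighbors = [[] for _ in range(num_cells)]
    for u, v, w in edges:
        neighbors[u].append((v, w))
        neighbors[v].append((u, w))

    x = [fixed.get(i, 0.0) for i in range(num_cells)]
    for _ in range(sweeps):
        for i in range(num_cells):
            if i in fixed or not neighbors[i]:
                continue
            # Row i of the Laplacian system: x[i] is the weighted mean of its neighbours.
            total = sum(w for _, w in neighbors[i])
            x[i] = sum(w * x[j] for j, w in neighbors[i]) / total
    return x
```

With two pads at 0 and 3 joined by a chain of two movable cells and unit weights, the cells settle at 1 and 2; production placers solve the same linear system with sparse conjugate-gradient solvers rather than plain sweeps.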

bonndoc – Der Publikationsserver der Universität Bonn

Nanometer VLSI placement and optimization for multi-objective design closure

Author: Luo Tao, Ph. D.
Publication date: 01/12/2007

In a VLSI physical synthesis flow, placement directly defines the interconnection, which affects many other design objectives, such as timing, power consumption, congestion, and thermal issues. With the scaling of technology, the relative interconnect delay increases dramatically. As a result, placement has become a bottleneck in deep sub-micron physical synthesis. In this dissertation, I propose several optimization algorithms, from global placement, placement migration, and timing driven placement to incremental power optimization, for multi-objective VLSI design closure. The first work is DPlace, a new global placement algorithm that scales well to the modern large-scale circuit placement problems. DPlace simulates the natural diffusion process to spread cells smoothly over the placement region, and uses both analytical and discrete techniques to improve the wire length. However, global placement is never sufficient for multi-objective design closure: a variety of design objectives have to be improved incrementally, such as timing, routing congestion, signal integrity, and heat distribution. Placement migration is a critical step to address the cell overlaps appearing during incremental optimizations. To achieve high placement stability, I propose a computational geometry based placement migration flow to cope with placement changes, and a new stability metric to measure the “similarity” between two placements accurately. Our placement migration algorithm has a clear advantage over conventional legalization algorithms in that the neighborhood characteristics of the original placement are preserved. For timing closure in high performance designs, I present a linear programming based incremental timing driven placement to improve the timing on critical paths directly. I further present an efficient timing driven placement algorithm (Pyramids). Two formulations of Pyramids are proposed, which are suitable for different optimization stages in a physical synthesis flow. Both approaches find the optimal location for timing of a cell in constant time, through computational geometry based approaches. For fast convergence of design closure, placement should be integrated with other optimization techniques. I propose to combine placement, gate sizing and Vt swapping techniques to reduce the total power consumption, especially the leakage power, which is becoming increasingly critical for nanometer VLSI design closure.

Electrical and Computer Engineerin

Texas ScholarWorks

Timing Closure in Chip Design

Author: Held Stephan
Publication venue: Universitäts- und Landesbibliothek Bonn

Achieving timing closure is a major challenge to the physical design of a computer chip. Its task is to find a physical realization fulfilling the speed specifications. In this thesis, we propose new algorithms for the key tasks of performance optimization, namely repeater tree construction; circuit sizing; clock skew scheduling; threshold voltage optimization and plane assignment. Furthermore, a new program flow for timing closure is developed that integrates these algorithms with placement and clock tree construction. For repeater tree construction a new algorithm for computing topologies, which are later filled with repeaters, is presented. To this end, we propose a new delay model for topologies that not only accounts for the path lengths, as existing approaches do, but also for the number of bifurcations on a path, which introduce extra capacitance and thereby delay. In the extreme cases of pure power optimization and pure delay optimization the optimum topologies regarding our delay model are minimum Steiner trees and alphabetic code trees with the shortest possible path lengths. We present a new, extremely fast algorithm that scales seamlessly between the two opposite objectives. For special cases, we prove the optimality of our algorithm. The efficiency and effectiveness in practice are demonstrated by comprehensive experimental results. The task of circuit sizing is to assign millions of small elementary logic circuits to elements from a discrete set of logically equivalent, predefined physical layouts such that power consumption is minimized and all signal paths are sufficiently fast. In this thesis we develop a fast heuristic approach for global circuit sizing, followed by a local search into a local optimum. Our algorithms use, in contrast to existing approaches, the available discrete layout choices and accurate delay models with slew propagation. The global approach iteratively assigns slew targets to all source pins of the chip and chooses a discrete layout of minimum size preserving the slew targets. In comprehensive experiments on real instances, we demonstrate that the worst path delay is within 7% of its lower bound on average after a few iterations. The subsequent local search reduces this gap to 2% on average. Combining global and local sizing we are able to size more than 5.7 million circuits within 3 hours. For the clock skew scheduling problem we develop the first algorithm with a strongly polynomial running time for the cycle time minimization in the presence of different cycle times and multi-cycle paths. In practice, an iterative local search method is much more efficient. We prove that this iterative method maximizes the worst slack, even when restricting the feasible schedule to certain time intervals. Furthermore, we enhance the iterative local approach to determine a lexicographically optimum slack distribution. The clock skew scheduling problem is then generalized to allow for simultaneous data path optimization. In fact, this is a time-cost tradeoff problem. We develop the first combinatorial algorithm for computing time-cost tradeoff curves in graphs that may contain cycles. Starting from the lowest-cost solution, the algorithm iteratively computes a descent direction by a minimum cost flow computation. The maximum feasible step length is then determined by a minimum ratio cycle computation. This approach can be used in chip design for several optimization tasks, e.g. threshold voltage optimization or plane assignment. Finally, the optimization routines are combined into a timing closure flow. Here, the global placement is alternated with global performance optimization. Net weights are used to penalize the length of critical nets during placement. After the global phase, the performance is improved further by applying more comprehensive optimization routines on the most critical paths. In the end, the clock schedule is optimized and clock trees are inserted. Computational results of the design flow are obtained on real-world computer chips.
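To give a feel for cycle time minimization by clock skew scheduling, consider a heavily simplified model: one common cycle time, only late-mode (setup) constraints, and zero setup time. Under those assumptions, which are mine and much narrower than the setting of the thesis, the best achievable cycle time equals the maximum mean delay over all cycles of the register graph. The sketch below computes that value with Karp's classical algorithm; it is not the strongly polynomial algorithm the abstract announces, and the function name and edge encoding are hypothetical.

```python
def max_mean_cycle(num_registers, paths):
    """Maximum mean delay over all directed cycles (Karp's algorithm).

    paths -- list of (source_register, target_register, delay) data paths
    Returns None if the register graph has no cycle.
    """
    n = num_registers
    neg_inf = float("-inf")

    # best[k][v]: largest total delay of a walk with exactly k edges ending in v.
    best = [[neg_inf] * n for _ in range(n + 1)]
    best[0] = [0.0] * n
    for k in range(1, n + 1):
        for u, v, delay in paths:
            if best[k - 1][u] > neg_inf and best[k - 1][u] + delay > best[k][v]:
                best[k][v] = best[k - 1][u] + delay

    result = None
    for v in range(n):
        if best[n][v] == neg_inf:
            continue
        candidate = min(
            (best[n][v] - best[k][v]) / (n - k)
            for k in range(n)
            if best[k][v] > neg_inf
        )
        if result is None or candidate > result:
            result = candidate
    return result
```

For three registers in a loop with path delays 4, 2 and 3, a zero-skew clock needs a cycle time of 4, while shifting the clock arrival times lets the loop run at the mean, 3.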

bonndoc – Der Publikationsserver der Universität Bonn

Physical design algorithms for asynchronous circuits

Author: Wu Gang
Publication venue: Iowa State University Digital Repository
Publication date: 01/01/2016

Asynchronous designs have been demonstrated to be able to achieve both higher performance and lower power compared with their synchronous counterparts. They provide a very promising solution to the emerging challenges in advanced technology. However, due to the lack of proper EDA tool support, the design cycle for asynchronous circuits is much longer than that for synchronous circuits. Thus, even with many advantages, asynchronous circuits are still not the mainstream in the industry. In this thesis, we provide several algorithms to resolve the emerging issues for the physical design of asynchronous circuits. Our proposed algorithms optimize asynchronous circuits using placement, gate sizing, repeater insertion and pipeline buffer insertion techniques. An incremental maximum cycle ratio algorithm is also proposed to speed up the timing analysis of asynchronous circuits.
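The maximum cycle ratio mentioned at the end is, in the usual marked-graph view of an asynchronous pipeline, the largest value of (total delay) / (number of tokens) over all cycles, and it bounds the steady-state cycle time. The sketch below computes it from scratch by binary search with Bellman-Ford style positive-cycle detection. It is not the incremental algorithm proposed in the thesis; the edge encoding `(u, v, delay, tokens)`, the function names and the tolerance are assumptions, and the code assumes non-negative delays and at least one token on every cycle.

```python
def has_positive_cycle(num_nodes, weighted_edges):
    """Longest-path relaxation from an implicit source connected to every node."""
    dist = [0.0] * num_nodes
    for _ in range(num_nodes):
        changed = False
        for u, v, w in weighted_edges:
            if dist[u] + w > dist[v] + 1e-12:
                dist[v] = dist[u] + w
                changed = True
        if not changed:
            return False
    # Still improving after num_nodes passes: some cycle has positive weight.
    return True


def max_cycle_ratio(num_nodes, edges, tol=1e-9):
    """Approximate max over cycles of sum(delay) / sum(tokens).

    edges -- list of (u, v, delay, tokens); delays >= 0, integer tokens >= 0,
             and every cycle is assumed to carry at least one token.
    """
    lo, hi = 0.0, sum(delay for _, _, delay, _ in edges)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        reweighted = [(u, v, d - mid * t) for u, v, d, t in edges]
        if has_positive_cycle(num_nodes, reweighted):
            lo = mid  # some cycle has ratio above mid
        else:
            hi = mid
    return hi
```

A ratio above the guess `mid` shows up as a cycle whose reweighted length `delay - mid * tokens` is positive, which is what each binary-search step tests; an incremental method avoids repeating this full computation after every small change to the circuit.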

Digital Repository @ Iowa State University (ISU)