119 research outputs found
Shortest Paths and Steiner Trees in VLSI Routing
Routing is one of the major steps in very-large-scale integration (VLSI) design. Its task is to find disjoint wire connections between sets of points on a chip, subject to numerous constraints. This problem is solved in a two-stage approach, which consists of so-called global and detailed routing steps. For each set of metal components to be connected, global routing reduces the search space by computing corridors in which detailed routing sequentially determines the desired connections as shortest paths. In this thesis, we present new theoretical results on Steiner trees and shortest paths, the two main mathematical concepts in routing. In the practical part, we give computational results of BonnRoute, a VLSI routing tool developed at the Research Institute for Discrete Mathematics at the University of Bonn. Interconnect signal delays are becoming increasingly important in modern chip designs. Therefore, the length of paths or direct delay measures should be taken into account when constructing rectilinear Steiner trees. We consider the problem of finding a rectilinear Steiner minimum tree (RSMT) that --- as a secondary objective --- minimizes a signal delay related objective. Given a source we derive some structural properties of RSMTs for which the weighted sum of path lengths from the source to the other terminals is minimized. Also, we present an exact algorithm for constructing RSMTs with weighted sum of path lengths as secondary objective, and a heuristic for various secondary objectives. Computational results for industrial designs are presented. We further consider the problem of finding a shortest rectilinear Steiner tree in the plane in the presence of rectilinear obstacles. The Steiner tree is allowed to run over obstacles; however, if it intersects an obstacle, then no connected component of the induced subtree must be longer than a given fixed length. This kind of length restriction is motivated by its application in VLSI routing where a large Steiner tree requires the insertion of repeaters which must not be placed on top of obstacles. We show that there are optimal length-restricted Steiner trees with a special structure. In particular, we prove that a certain graph (called augmented Hanan grid) always contains an optimal solution. Based on this structural result, we give an approximation scheme for the special case that all obstacles are of rectangular shape or are represented by at most a constant number of edges. Turning to the shortest paths problem, we present a new generic framework for Dijkstra's algorithm for finding shortest paths in digraphs with non-negative integral edge lengths. Instead of labeling individual vertices, we label subgraphs which partition the given graph. Much better running times can be achieved if the number of involved subgraphs is small compared to the order of the original graph and the shortest path problems restricted to these subgraphs is computationally easy. As an application we consider the VLSI routing problem, where we need to find millions of shortest paths in partial grid graphs with billions of vertices. Here, the algorithm can be applied twice, once in a coarse abstraction (where the labeled subgraphs are rectangles), and once in a detailed model (where the labeled subgraphs are intervals). Using the result of the first algorithm to speed up the second one via goal-oriented techniques leads to considerably reduced running time. We illustrate this with the routing program BonnRoute on leading-edge industrial chips. Finally, we present computational results of BonnRoute obtained on real-world VLSI chips. BonnRoute fulfills all requirements of modern VLSI routing and has been used by IBM and its customers over many years to produce more than one thousand different chips. To demonstrate the strength of BonnRoute as a state-of-the-art industrial routing tool, we show that it performs excellently on all traditional quality measures such as wire length and number of vias, but also on further criteria of equal importance in the every-day work of the designer
Design and modelling of variability tolerant on-chip communication structures for future high performance system on chip designs
The incessant technology scaling has enabled the integration of functionally complex System-on-Chip (SoC) designs with a large number of heterogeneous systems on a single chip. The processing elements on these chips are integrated through on-chip communication structures which provide the infrastructure necessary for the exchange of data and control signals, while meeting the strenuous physical and design constraints. The use of vast amounts of on chip communications will be central to future designs where variability is an inherent characteristic. For this reason, in this thesis we investigate the performance and variability tolerance of typical on-chip communication structures. Understanding of the relationship between variability and communication is paramount for the designers; i.e. to devise new methods and techniques for designing performance and power efficient communication circuits in the forefront of challenges presented by deep sub-micron (DSM) technologies.
The initial part of this work investigates the impact of device variability due to Random Dopant Fluctuations (RDF) on the timing characteristics of basic communication elements. The characterization data so obtained can be used to estimate the performance and failure probability of simple links through the methodology proposed in this work. For the Statistical Static Timing Analysis (SSTA) of larger circuits, a method for accurate estimation of the probability density functions of different circuit parameters is proposed. Moreover, its significance on pipelined circuits is highlighted. Power and area are one of the most important design metrics for any integrated circuit (IC) design. This thesis emphasises the consideration of communication reliability while optimizing for power and area. A methodology has been proposed for the simultaneous optimization of performance, area, power and delay variability for a repeater inserted interconnect. Similarly for multi-bit parallel links, bandwidth driven optimizations have also been performed. Power and area efficient semi-serial links, less vulnerable to delay variations than the corresponding fully parallel links are introduced. Furthermore, due to technology scaling, the coupling noise between the link lines has become an important issue. With ever decreasing supply voltages, and the corresponding reduction in noise margins, severe challenges are introduced for performing timing verification in the presence of variability. For this reason an accurate model for crosstalk noise in an interconnection as a function of time and skew is introduced in this work. This model can be used for the identification of skew condition that gives maximum delay noise, and also for efficient design verification
Initial detailed routing algorithms
In this work, we present a study of the problem of routing in the context of the VLSI physical synthesis flow. We study the fundamental routing algorithms such as maze routing, A*, and Steiner tree-based algorithms, as well as some global routing algorithms, namely FastRoute 4.0 and BoxRouter 2.0. We dissect some of the major state of the art initial detailed routing tools, such as RegularRoute, TritonRoute, SmartDR and Dr.CU 2.0. We also propose an initial detailed routing flow, and present an implementation of the proposed routing flow, with a track assignment technique that models the problem as an instance of the maximum independent weighted set (MWIS) and utilizes integer linear programming (ILP) as a solver. The implementation of the proposed initial detailed routing flow also includes an implementation of multiple-source and multiple-target A* for terminal andnet connection with adjustable rules and weights. Finally, we also present a study of the results obtained by the implementation of the proposed initial detailed routing flow and a comparison with the ISPD 2019 contest winners, considering the ISPD 2019 and benchmark suite and evaluation tools.Neste trabalho, apresentamos um estudo do problema de roteamento no contexto do fluxo de síntese física de circuitos integrados VLSI. Nós estudamos algoritmos de roteamento fundamentais como roteamento de labirinto, A* e baseados em árvores de Steiner, além de alguns algoritmos de roteamento global como FastRoute 4.0 e BoxRouter 2.0. Nós dissecamos alguns dos principais trabalhos de roteamento detalhado inicial do estado da arte, como RegularRoute, TritonRoute, SmartDR e Dr.CU 2.0. Também propomos um fluxo de roteamento detalhado inicial, e apresentamos uma implementação do fluxo de roteametno proposto, com uma técnica de assinalamento de trilhas que modela o problema como uma instância do problema do conjunto independente de peso máximo e usa programação linear inteira como um resolvedor. A implementação do fluxo de rotemaento detalhado inicial proposto também inclui uma implementação de um A* com múltiplas fontes e múltiplos destinos para conexão de terminais e redes, com regras e pesos ajustáveis. Por fim, nós apresentamos um estudo dos resultados obtidos pela implementação do fluxo de roteamento detalhado inicial proposto e comparamos com os vencedores do ISPD 2019 contest considerando a suíte de teste e ferramentas de avaliação do ISPD 2019
Recommended from our members
Analysis and design on low-power multi-Gb/s serial links
High speed serial links are critical components for addressing the growing demand for I/O bandwidth in next-generation computing applications, such as many-core systems, backplane and optical data communications. Due to continued process scaling and circuit innovations, today's CMOS serial link transceivers can achieve tens of Gb/s per pin. However, most of their reported power efficiency improves much slower than the rise of data rate. Therefore, aggregate I/O power is increasing and will exceed the power budget if the trend for more off-chip bandwidth is sustained.
In this work, a system level statistical analysis of serial links is first described, and compares the link performance of Non-Return-to-Zero (2-PAM) with higher-order modulation (duobinary) signaling schemes. This method enables fast and accurate BER distribution simulation of serial link transceivers that include channel and circuit imperfections, such as finite pulse rise/fall time, duty cycle variation, and both receiver and transmitter forwarded-clock jitter.
Second, in order to address link power efficiency, two test chips have been implemented. The first one describes a quad-lane, 6.4-7.2 Gb/s serial link receiver prototype using a forwarded clock architecture. A novel phase deskew scheme using injection-locked ring oscillators (ILRO) is proposed that achieves greater than one UI of phase shift for multiple clock phases, eliminating phase rotation and interpolation required in conventional architectures. Each receiver, optimized for power efficiency, consists of a low-power linear equalizer, four offset-cancelled quantizers for 1:4 demultiplexing, and an injection-locked ring oscillator coupled to a low-voltage swing, global clock distribution. Measurement results show a 6.4-7.2Gb/s data rate with BER < 10⁻¹² across 14 cm of PCB, and an 8Gb/s data rate through 4cm of PCB. Designed in a 1.2V, 90nm CMOS process, the ILRO achieves a wide tuning range from 1.6-2.6GHz. The total area of each receiver is 0.0174mm², resulting in a measured power efficiency of 0.6mW/Gb/s.
Improving upon the first test chip, a second test chip for 8Gb/s forwarded clock serial link receivers exploits a low-power super-harmonic injection-locked ring oscillator for symmetric multi-phase local clock generation and deskewing. Further power reduction is achieved by designing most of the receiver circuits in the near-threshold region (0.6V supply), with the exception of only the global clock buffer, test buffers and synthesized digital test circuits at nominal 1V supply. At the architectural level, a 1:10 direct demultiplexing rate is chosen to achieve low supply operation by exploiting high-parallelism. Fabricated in 65nm CMOS technology, two receiver prototypes are integrated in this test chip, one without and the other with front-end boot-strapped S/Hs. Including the amortized power of global clock distribution, the proposed serial link receivers consume 1.3mW and 2mW respectively at 8Gb/s input data rate, achieving a power efficiency of 0.163mW/Gb/s and 0.25mW/Gb/s. Measurement results show both receivers achieve BER < 10⁻¹² across a 20-cm FR4 PCB channel
Design of complex integrated systems based on networks-on-chip: Trading off performance, power and reliability
The steady advancement of microelectronics is associated with an escalating number of challenges for design engineers due to both the tiny dimensions and the enormous complexity of integrated systems. Against this background, this work deals with Network-On-Chip (NOC) as the emerging design paradigm to cope with diverse issues of nanotechnology. The detailed investigations within the chapters focus on the communication-centric aspects of multi-core-systems, whereas performance, power consumption as well as reliability are considered likewise as the essential design criteria
Hardware Architectures and Implementations for Associative Memories : the Building Blocks of Hierarchically Distributed Memories
During the past several decades, the semiconductor industry has grown into a global industry with revenues around $300 billion. Intel no longer relies on only transistor scaling for higher CPU performance, but instead, focuses more on multiple cores on a single die. It has been projected that in 2016 most CMOS circuits will be manufactured with 22 nm process. The CMOS circuits will have a large number of defects. Especially when the transistor goes below sub-micron, the original deterministic circuits will start having probabilistic characteristics. Hence, it would be challenging to map traditional computational models onto probabilistic circuits, suggesting a need for fault-tolerant computational algorithms. Biologically inspired algorithms, or associative memories (AMs)—the building blocks of cortical hierarchically distributed memories (HDMs) discussed in this dissertation, exhibit a remarkable match to the nano-scale electronics, besides having great fault-tolerance ability. Research on the potential mapping of the HDM onto CMOL (hybrid CMOS/nanoelectronic circuits) nanogrids provides useful insight into the development of non-von Neumann neuromorphic architectures and semiconductor industry. In this dissertation, we investigated the implementations of AMs on different hardware platforms, including microprocessor based personal computer (PC), PC cluster, field programmable gate arrays (FPGA), CMOS, and CMOL nanogrids.
We studied two types of neural associative memory models, with and without temporal information. In this research, we first decomposed the computational models into basic and common operations, such as matrix-vector inner-product and k-winners-take-all (k-WTA). We then analyzed the baseline performance/price ratio of implementing the AMs with a PC. We continued with a similar performance/price analysis of the implementations on more parallel hardware platforms, such as PC cluster and FPGA. However, the majority of the research emphasized on the implementations with all digital and mixed-signal full-custom CMOS and CMOL nanogrids.
In this dissertation, we draw the conclusion that the mixed-signal CMOL nanogrids exhibit the best performance/price ratio over other hardware platforms. We also highlighted some of the trade-offs between dedicated and virtualized hardware circuits for the HDM models. A simple time-multiplexing scheme for the digital CMOS implementations can achieve comparable throughput as the mixed-signal CMOL nanogrids
Design of 5v Digital Standard Cells And I/O Libraries for Military Standard Temperatures
The scope of this research work is to develop digital standard cell and I/O cell libraries operable at 5V power supply and operable up to 125�C using Peregrine 0.5um 3.3 V process. Device geometries are selected based on Ion/Ioff ratios at 125�C. The cell schematic, layout and abstracted views are generated for both the libraries The Standard cell and I/O libraries are characterized for timing and power and the characterization data is realized in various formats compatible with logic synthesis and place and route tools. The pads have been tested for robustness to ESD. A tutorial on abstraction of standard cells and IO cells is prepared using the Cadence Abstract Generator.School of Electrical & Computer Engineerin
A Multiple-objective ILP based Global Routing Approach for VLSI ASIC Design
A VLSI chip can today contain hundreds of millions transistors and is expected to
contain more than 1 billion transistors in the next decade.
In order to handle this rapid growth in integration technology,
the design procedure is therefore divided into a sequence of design
steps. Circuit layout is the design step in which a physical
realization of a circuit is obtained from its functional description.
Global routing is one of the key subproblems of the circuit layout
which involves finding an approximate path for the wires connecting the
elements of the circuit without violating resource constraints.
The global routing problem is NP-hard, therefore, heuristics capable of
producing high quality routes with little computational effort are required
as we move into the Deep Sub-Micron (DSM) regime.
In this thesis, different approaches for global routing problem are first
reviewed. The advantages and disadvantages of these approaches are also summarized.
According to this literature review, several mathematical programming based global
routing models are fully investigated. Quality of solution obtained by
these models are then compared with traditional Maze routing technique.
The experimental results show that the proposed model can optimize several global routing
objectives simultaneously and effectively. Also, it is easy to incorporate new
objectives into the proposed global routing model.
To speedup the computation time of the proposed ILP based global router, several
hierarchical methods are combined with the flat ILP based global routing
approach. The experimental results indicate that the bottom-up global routing
method can reduce the computation time effectively with a slight increase of maximum
routing density.
In addition to wire area, routability, and vias, performance and low power
are also important goals in global routing, especially in deep submicron designs.
Previous efforts that focused on power optimization for global routing
are hindered by excessively long run times or the routing of a subset of the
nets. Accordingly, a power efficient multi-pin global routing
technique (PIRT) is proposed in this thesis.
This integer linear programming based techniques strives to find a power
efficient global routing solution.
The results indicate that an average power savings as high as 32\% for the
130-nm technology can be achieved with no impact on the maximum chip frequency
Adaptive overcurrent protection application for a micro-grid system in South Africa
Abstract: The non-directional overcurrent protection (International Electrotechnical Commission standard IEC 617 or American National Standards Institute ANSI/Institute of Electrical and Electronic Engineers IEEE C37.2 standard device number 51) is one protection type/relay function that has stood the test of time. The latest generation of relays has brought about enhanced capabilities. The most popular overcurrent protection, which is the Inverse Definite Minimum Time (IDMT) function, has proven to provide coordination of electrical nodes with ease. This is one of the oldest but extremely reliable relay characteristic. A number of new protection functions and enhancements to existing functions are commensurate to the advanced technical capabilities of the newer generation protective devices. The new development techniques include “acceleration”, which is a technique of sending the circuit breaker status of the near end of a line or feeder to the far end to influence the relay decision at the far end. Impedance protection, unit line protection, etc. have come with many advanced characteristics and properties. The enhancements to protection devices bear special features but cannot substitute inverse time overcurrent protection, which, up to now, is a reliable backup in feeder protection schemes in South Africa. The superior feature is the capability to achieve coordination between a series of protective devices. This is achievable without excessive damage to the electrical components of the circuit...M.Ing. (Electrical Engineering
A complete design path for the layout of flexible macros
XIV+172hlm.;24c
- …