Search CORE

4,783 research outputs found

Course grained low power design flow using UPF

Author: Varanasi Archana
Publication venue: RIT Scholar Works
Publication date: 01/08/2009
Field of study

Increased system complexity has led to the substitution of the traditional bottom-up design flow by systematic hierarchical design flow. The main motivation behind the evolution of such an approach is the increasing difficulty in hardware realization of complex systems. With decreasing channel lengths, few key problems such as timing closure, design sign-off, routing complexity, signal integrity, and power dissipation arise in the design flows. Specifically, minimizing power dissipation is critical in several high-end processors. In high-end processors, the design complexity contributes to the overall dynamic power while the decreasing transistor size results in static power dissipation. This research aims at optimizing the design flow for power and timing using the unified power format (UPF). UPF provides a strategic format to specify power-aware design information at every stage in the flow. The low power reduction techniques enforced in this research are multi-voltage, multi-threshold voltage (Vth), and power gating with state retention. An inherent design challenge addressed in this research is the choice of power optimization techniques as the flow advances from synthesis to physical design. A top-down digital design flow for a 32 bit MIPS RISC processor has been implemented with and without UPF synthesis flow for 65nm technology. The UPF synthesis is implemented with two voltages, 1.08V and 0.864V (Multi-VDD). Area, power and timing metrics are analyzed for the flows developed. Power savings of about 20 % are achieved in the design flow with \u27multi-threshold\u27 power technique compared to that of the design flow with no low power techniques employed. Similarly, 30 % power savings are achieved in the design flow with the UPF implemented when compared to that of the design flow with \u27multi-threshold\u27 power technique employed. Thus, a cumulative power savings of 42% has been achieved in a complete power efficient design flow (UPF) compared to that of the generic top-down standard flow with no power saving techniques employed. This is substantiated by the low voltage operation of modules in the design, reduction in clock switching power by gating clocks in the design and extensive use of HVT and LVT standard cells for implementation. The UPF synthesis flow saw the worst timing slack and more area when compared to those of the `multi-threshold\u27 or the generic flow. Percentage increase in the area with UPF is approximately 15%; a significant source for this increase being the additional power controlling logic added

RIT Scholar Works

Design methodology and productivity improvement in high speed VLSI circuits

Author: Hossain KM Mozammel
Publication venue: Colorado State University. Libraries
Publication date: 01/01/2017
Field of study

2017 Spring.Includes bibliographical references.To view the abstract, please see the full text of the document

Mountain Scholar (Digital Collections of Colorado and Wyoming)

Algorithmic techniques for physical design : macro placement and under-the-cell routing

Author: Vidal Obiols Alexandre
Publication venue: Universitat Politècnica de Catalunya
Publication date: 24/01/2020
Field of study

With the increase of chip component density and new manufacturability constraints imposed by modern technology nodes, the role of algorithms for electronic design automation is key to the successful implementation of integrated circuits. Two of the critical steps in the physical design flows are macro placement and ensuring all design rules are honored after timing closure. This thesis proposes contributions to help in these stages, easing time-consuming manual steps and helping physical design engineers to obtain better layouts in reduced turnaround time. The first contribution is under-the-cell routing, a proposal to systematically connect standard cell components via lateral pins in the lower metal layers. The aim is to reduce congestion in the upper metal layers caused by extra metal and vias, decreasing the number of design rule violations. To allow cells to connect by abutment, a standard cell library is enriched with instances containing lateral pins in a pre-selected sharing track. Algorithms are proposed to maximize the numbers of connections via lateral connection by mapping placed cell instances to layouts with lateral pins, and proposing local placement modifications to increase the opportunities for such connections. Experimental results show a significant decrease in the number of pins, vias, and in number of design rule violations, with negligible impact on wirelength and timing. The second contribution, done in collaboration with eSilicon (a leading ASIC design company), is the creation of HiDaP, a macro placement tool for modern industrial designs. The proposed approach follows a multilevel scheme to floorplan hierarchical blocks, composed of macros and standard cells. By exploiting RTL information available in the netlist, the dataflow affinity between these blocks is modeled and minimized to find a macro placement with good wirelength and timing properties. The approach is further extended to allow additional engineer input, such as preferred macro locations, and also spectral and force methods to guide the floorplanning search. Experimental results show that the layouts generated by HiDaP outperforms those obtained by a state-of-the-art EDA physical design software, with similar wirelength and better timing when compared to manually designed tape-out ready macro placements. Layouts obtained by HiDaP have successfully been brought to near timing closure with one to two rounds of small modifications by physical design engineers. HiDaP has been fully integrated in the design flows of the company and its development remains an ongoing effort.A causa de l'increment de la densitat de components en els xip i les noves restriccions de disseny imposades pels últims nodes de fabricació, el rol de l'algorísmia en l'automatització del disseny electrònic ha esdevingut clau per poder implementar circuits integrats. Dos dels passos crucials en el procés de disseny físic és el placement de macros i assegurar la correcció de les regles de disseny un cop les restriccions de timing del circuit són satisfetes. Aquesta tesi proposa contribucions per ajudar en aquests dos reptes, facilitant laboriosos passos manuals en el procés i ajudant als enginyers de disseny físic a obtenir millors resultats en menys temps. La primera contribució és el routing "under-the-cell", una proposta per connectar cel·les estàndard usant pins laterals en les capes de metall inferior de manera sistemàtica. L'objectiu és reduir la congestió en les capes de metall superior causades per l'ús de metall i vies, i així disminuir el nombre de violacions de regles de disseny. Per permetre la connexió lateral de cel·les, estenem una llibreria de cel·les estàndard amb dissenys que incorporen connexions laterals. També proposem modificacions locals al placement per permetre explotar aquest tipus de connexions més sovint. Els resultats experimentals mostren una reducció significativa en el nombre de pins, vies i nombre de violacions de regles de disseny, amb un impacte negligible en wirelength i timing. La segona contribució, desenvolupada en col·laboració amb eSilicon (una empresa capdavantera en disseny ASIC), és el desenvolupament de HiDaP, una eina de macro placement per a dissenys industrials actuals. La proposta segueix un procés multinivell per fer el floorplan de blocks jeràrquics, formats per macros i cel·les estàndard. Mitjançant la informació RTL disponible en la netlist, l'afinitat de dataflow entre els mòduls es modela i minimitza per trobar macro placements amb bones propietats de wirelength i timing. La proposta també incorpora la possibilitat de rebre input addicional de l'enginyer, com ara suggeriments de les posicions de les macros. Finalment, també usa mètodes espectrals i de forçes per guiar la cerca de floorplans. Els resultats experimentals mostren que els dissenys generats amb HiDaP són millors que els obtinguts per eines comercials capdavanteres de EDA. Els resultats també mostren que els dissenys presentats poden obtenir un wirelength similar i millor timing que macro placements obtinguts manualment, usats per fabricació. Alguns dissenys obtinguts per HiDaP s'han dut fins a timing-closure en una o dues rondes de modificacions incrementals per part d'enginyers de disseny físic. L'eina s'ha integrat en el procés de disseny de eSilicon i el seu desenvolupament continua més enllà de les aportacions a aquesta tesi.Postprint (published version

UPCommons. Portal del coneixement obert de la UPC

Timing Closure in Chip Design

Author: Held Stephan
Publication venue: Universitäts- und Landesbibliothek Bonn
Publication date
Field of study

Achieving timing closure is a major challenge to the physical design of a computer chip. Its task is to find a physical realization fulfilling the speed specifications. In this thesis, we propose new algorithms for the key tasks of performance optimization, namely repeater tree construction; circuit sizing; clock skew scheduling; threshold voltage optimization and plane assignment. Furthermore, a new program flow for timing closure is developed that integrates these algorithms with placement and clocktree construction. For repeater tree construction a new algorithm for computing topologies, which are later filled with repeaters, is presented. To this end, we propose a new delay model for topologies that not only accounts for the path lengths, as existing approaches do, but also for the number of bifurcations on a path, which introduce extra capacitance and thereby delay. In the extreme cases of pure power optimization and pure delay optimization the optimum topologies regarding our delay model are minimum Steiner trees and alphabetic code trees with the shortest possible path lengths. We presented a new, extremely fast algorithm that scales seamlessly between the two opposite objectives. For special cases, we prove the optimality of our algorithm. The efficiency and effectiveness in practice is demonstrated by comprehensive experimental results. The task of circuit sizing is to assign millions of small elementary logic circuits to elements from a discrete set of logically equivalent, predefined physical layouts such that power consumption is minimized and all signal paths are sufficiently fast. In this thesis we develop a fast heuristic approach for global circuit sizing, followed by a local search into a local optimum. Our algorithms use, in contrast to existing approaches, the available discrete layout choices and accurate delay models with slew propagation. The global approach iteratively assigns slew targets to all source pins of the chip and chooses a discrete layout of minimum size preserving the slew targets. In comprehensive experiments on real instances, we demonstrate that the worst path delay is within 7% of its lower bound on average after a few iterations. The subsequent local search reduces this gap to 2% on average. Combining global and local sizing we are able to size more than 5.7 million circuits within 3 hours. For the clock skew scheduling problem we develop the first algorithm with a strongly polynomial running time for the cycle time minimization in the presence of different cycle times and multi-cycle paths. In practice, an iterative local search method is much more efficient. We prove that this iterative method maximizes the worst slack, even when restricting the feasible schedule to certain time intervals. Furthermore, we enhance the iterative local approach to determine a lexicographically optimum slack distribution. The clock skew scheduling problem is then generalized to allow for simultaneous data path optimization. In fact, this is a time-cost tradeoff problem. We developed the first combinatorial algorithm for computing time-cost tradeoff curves in graphs that may contain cycles. Starting from the lowest-cost solution, the algorithm iteratively computes a descent direction by a minimum cost flow computation. The maximum feasible step length is then determined by a minimum ratio cycle computation. This approach can be used in chip design for several optimization tasks, e.g. threshold voltage optimization or plane assignment. Finally, the optimization routines are combined into a timing closure flow. Here, the global placement is alternated with global performance optimization. Netweights are used to penalize the length of critical nets during placement. After the global phase, the performance is improved further by applying more comprehensive optimization routines on the most critical paths. In the end, the clock schedule is optimized and clocktrees are inserted. Computational results of the design flow are obtained on real-world computer chips

bonndoc – Der Publikationsserver der Universität Bonn

Recommended from our members

Layer assignment and routing optimization for advanced technologies

Author: Liu Derong, (Ph. D. in electrical and computer engineering),
Publication venue
Publication date: 23/10/2018
Field of study

As VLSI technology scales to deep sub-micron and beyond, it becomes increasingly challenging to achieve timing closure for VLSI design. Since a complete design flow consists of several phases, such as logic synthesis, placement, and routing, interconnect synthesis plays an important role which includes buffer insertion/sizing and timing-driven routing. Although progress has been achieved by many advanced routing techniques, the following aspects can be exploited sufficiently for further improvement: (1) incremental layer assignment for timing optimization; (2) signal routing with the requirement of regularity; (3) power-efficient optical-electrical interconnect paradigm. Thus, to perform the layer assignment and routing optimization for advanced technologies, an automated routing engine in a global view is essential to benefit the interconnect design while satisfying specific requirements. This dissertation proposes a set of algorithms and methodology on layer assignment and routing optimization for advanced technologies. The research includes two timing-driven incremental layer assignment approaches, synergistic topology generation and routing synthesis for signal groups, and optical-electrical routing design for power efficiency. For incremental layer assignment, most of the conventional approaches target via minimization but neglect the timing issues. Meanwhile, via delays are ignored but should be considered in emerging technology nodes. Then two timing-driven incremental layer assignment frameworks are proposed, where all the nets are solved simultaneously with the integration of via delays: (1) optimization of the total sum of net delays and reduction of slew violations; (2) minimization of critical path timing in selected nets. For on-chip signal routing, the bundled bits in one group may have different pin locations, but they have to be routed in a regular manner by sharing common topologies. Very few previous works target inter-bit regularity via multi-layer topology selection. Furthermore, the routability and wire-length of the signal bits should also be optimized. Then an advanced synergistic routing engine is promoted, which is able to not only control routability and wire-length but also guide each bit routing intelligently for design regularity. For optical-electrical co-design routing, optical interconnect shows its advantage due to the dominance of bandwidth-distance-power properties. The previous works lack a detailed exploration of optical-electrical co-design for on-chip interconnects. During the transmission, signal quality can be affected by various loss sources and Electrical to Optical (EO)/Optical to Electrical (OE) conversion overheads should also be considered. Then a power-efficient routing flow for on-chip signals is presented, where optical connections can collaborate with electrical wires seamlessly. The effectiveness of proposed algorithms and techniques is demonstrated in this dissertation. These approaches are able to achieve the improvements regarding specific metrics and eventually benefit the routing flow.Electrical and Computer Engineerin

Texas ScholarWorks

Algorithmic techniques for physical design : macro placement and under-the-cell routing

Author: Vidal Obiols Alexandre
Publication venue: Universitat Politècnica de Catalunya
Publication date: 01/01/2020
Field of study

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Tesis Doctorals en Xarxa

Physical Design and Clock Tree Synthesis Methods For A 8-Bit Processor

Author: Daxiniray Debaprasad
Publication venue
Publication date: 01/01/2016
Field of study

Now days a number of processors are available with a lot kind of feature from different industries. A processor with similar kind of architecture of the current processors only missing the memory stuffs like the RAM and ROM has been designed here with the help of Verilog style of coding. This processor contains architecturally the program counter, instruction register, ALU, ALU latch, General Purpose Registers, control state module, flag registers and the core module containing all the modules. And a test module is designed for testing the processor. After the design of the processor with successful functionality, the processor is synthesized with 180nm technology. The synthesis is performed with the data path optimization like the selection of proper adders and multipliers for timing optimization in the data path while the ALU operations are performed. During synthesis how to take care of the worst negative slack (WNS), how to include the clock gating cells, how to define the cost and path groups etc. have been covered. After the proper synthesis we get the proper net list and the synthesized constraint file for carrying out the physical design. In physical design the steps like floor-planning, partitioning, placement, legalization of the placement, clock tree synthesis, and routing etc. have been performed. At all the stages the static timing analysis is performed for the timing meet of the design for better performance in terms of timing or frequency. Each steps of physical design are discussed with special effort towards the concepts behind the step. Out of all the steps of physical design the clock tree synthesis is performed with some improvement in the performance of the clock tree by creating a symmetrical clock tree and maintaining more common clock paths. A special algorithm has been framed for creating a symmetrical clock tree and thereby making the power consumption of the clock tree low

ethesis@nitr

Physical design of USB1.1

Author: Koti Boddu Siva
Publication venue
Publication date: 01/01/2016
Field of study

In earlier days, interfacing peripheral devices to host computer has a big problematic. There existed so many different kinds’ ports like serial port, parallel port, PS/2 etc. And their use restricts many situations, Such as no hot-pluggability and involuntary configuration. There are very less number of methods to connect the peripheral devices to host computer. The main reason that Universal Serial Bus was implemented to provide an additional benefits compared to earlier interfacing ports. USB is designed to allow many peripheral be connecting using single standardize interface. It provides an expandable fast, cost effective, hot-pluggable plug and play serial hardware interface that makes life of computer user easier allowing them to plug different devices to into USB port and have them configured automatically. In this thesis demonstrated the USB v1.1 architecture part in briefly and generated gate level net list form RTL code by applying the different constraints like timing, area and power. By applying the various types design constraints so that the performance was improved by 30%. And then it implemented in physically by using SoC encounter EDI system, estimation of chip size, power analysis and routing the clock signal to all flip-flops presented in the design. To reduce the clock switching power implemented register clustering algorithm (DBSCAN). In this design implementation TSMC 180nm technology library is used

ethesis@nitr