12 research outputs found

    Placement driven retiming with a coupled edge timing model

    Retiming is a widely investigated technique for performance optimization that performs powerful modifications on a circuit netlist. However, it is often unclear whether the predicted performance improvement will still hold after placement. This paper presents a new retiming algorithm that uses a highly accurate timing model, taking into account the effect of retiming on the capacitive loads of single wires as well as fanout systems. We propose integrating retiming into a timing-driven standard cell placement environment based on simulated annealing, where retiming is used as an optimization technique throughout the whole placement process. The experimental results show the benefit of the proposed approach: compared with the conventional design flow based on standard FEAS, our approach improved cycle time by up to 34% and by 17% on average.
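
    For readers unfamiliar with the FEAS baseline named in this abstract, the following is a minimal sketch of the Leiserson-Saxe feasibility test, not the placement-driven algorithm of the paper. It assumes the usual graph model (a delay per gate, a register count per edge, and every cycle carrying at least one register); the function names and the dictionary/tuple representation are illustrative choices only.

```python
from collections import defaultdict, deque

def retimed_weights(edges, r):
    """Register count on each edge u->v after retiming: w + r[v] - r[u]."""
    return [(u, v, w + r[v] - r[u]) for u, v, w in edges]

def arrival_times(delay, edges):
    """Longest purely combinational delay ending at each node.

    Only edges with zero registers propagate delay; they must form a DAG.
    """
    succ = defaultdict(list)
    indeg = {v: 0 for v in delay}
    for u, v, w in edges:
        if w == 0:
            succ[u].append(v)
            indeg[v] += 1
    arrival = dict(delay)
    queue = deque(v for v in delay if indeg[v] == 0)
    visited = 0
    while queue:  # topological sweep over the zero-register subgraph
        u = queue.popleft()
        visited += 1
        for v in succ[u]:
            arrival[v] = max(arrival[v], arrival[u] + delay[v])
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)
    if visited != len(delay):
        raise ValueError("circuit contains a cycle with no registers")
    return arrival

def feas(delay, edges, c):
    """Return a retiming r achieving clock period <= c, or None if none is found."""
    r = {v: 0 for v in delay}
    for _ in range(len(delay) - 1):
        arrival = arrival_times(delay, retimed_weights(edges, r))
        for v, t in arrival.items():
            if t > c:  # node finishes too late: pull one register onto its inputs
                r[v] += 1
    final = retimed_weights(edges, r)
    if max(arrival_times(delay, final).values()) <= c:
        return r
    return None

# Hypothetical 3-gate ring with unit delays and two registers on the edge c -> a.
delay = {"a": 1, "b": 1, "c": 1}
edges = [("a", "b", 0), ("b", "c", 0), ("c", "a", 2)]
print(feas(delay, edges, 2))  # a legal retiming for period 2
print(feas(delay, edges, 1))  # None: period 1 is not achievable
```

    In this toy ring the unretimed period is 3; moving one register from the edge c -> a to the edge b -> c brings it down to 2. A wire-aware timing model such as the one the paper proposes would replace the fixed per-gate delays assumed here.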

    Retiming with wire delay and post-retiming register placement.

    Tong Ka Yau Dennis. Thesis (M.Phil.)--Chinese University of Hong Kong, 2004. Includes bibliographical references (leaves 77-81). Abstracts in English and Chinese.

    Chapter 1: Introduction
        1.1 Motivations
        1.2 Progress on the Problem
        1.3 Our Contributions
        1.4 Thesis Organization
    Chapter 2: Background on Retiming
        2.1 Introduction
        2.2 Preliminaries
        2.3 Retiming Problem
    Chapter 3: Literature Review on Retiming
        3.1 Introduction
        3.2 The First Retiming Paper
            3.2.1 "Retiming Synchronous Circuitry"
        3.3 Important Extensions of the Basic Retiming Algorithm
            3.3.1 "A Fresh Look at Retiming via Clock Skew Optimization"
            3.3.2 "An Improved Algorithm for Minimum-Area Retiming"
            3.3.3 "Efficient Implementation of Retiming"
        3.4 Retiming in Physical Design Stages
            3.4.1 "Physical Planning with Retiming"
            3.4.2 "Simultaneous Circuit Partitioning/Clustering with Retiming for Performance Optimization"
            3.4.3 "Performance Driven Multi-level and Multiway Partitioning with Retiming"
        3.5 Retiming with More Sophisticated Timing Models
            3.5.1 "Retiming with Non-zero Clock Skew, Variable Register, and Interconnect Delay"
            3.5.2 "Placement Driven Retiming with a Coupled Edge Timing Model"
        3.6 Post-Retiming Register Placement
            3.6.1 "Layout Driven Retiming Using the Coupled Edge Timing Model"
            3.6.2 "Integrating Logic Retiming and Register Placement"
    Chapter 4: Retiming with Gate and Wire Delay [2]
        4.1 Introduction
        4.2 Problem Formulation
        4.3 Optimal Approach [2]
            4.3.1 Original Mathematical Framework for Retiming
            4.3.2 A Modified Optimal Approach
        4.4 Near-Optimal Fast Approach [2]
            4.4.1 Considering Wire Delay Only
            4.4.2 Considering Both Gate and Wire Delay
            4.4.3 Computational Complexity
            4.4.4 Experimental Results
        4.5 Lin's Optimal Approach [23]
            4.5.1 Theoretical Results
            4.5.2 Algorithm Description
            4.5.3 Computational Complexity
            4.5.4 Experimental Results
        4.6 Summary
    Chapter 5: Register Insertion in Placement [36]
        5.1 Introduction
        5.2 Problem Formulation
        5.3 Placement of Registers After Retiming
            5.3.1 Topology Finding
            5.3.2 Register Placement
        5.4 Experimental Results
        5.5 Summary
    Chapter 6: Conclusion
    Bibliography

    End-to-End Industrial Study of Retiming

    Sequential circuits are combinational circuits separated by registers. Retiming is considered the most promising technique for optimizing sequential circuits; it moves edge-triggered registers across the combinational logic without changing functionality. Despite significant effort spent on sequential optimization since the 1980s, few works have discussed its performance in an end-to-end design flow. Retiming algorithms have mostly been evaluated at the logic level. However, it turns out that retiming results at the logic level can differ significantly from those evaluated at the physical level. This paper reports how retiming algorithms perform in an end-to-end industrial design flow, using seven industry designs taken from a recent 14nm microprocessor. Experiments are conducted with several complete industrial design flows, and the evaluations are made at the end of the physical design flow. The experimental results show that the performance (design quality) of the retiming algorithms varies across designs. Based on these experimental results, we discover a feature that describes the retiming potential of sequential designs. This model successfully forecasts whether a given industrial design can be significantly improved by retiming in an end-to-end design flow with regard to timing, area, and power.

    An efficient incremental algorithm for min-area retiming

    As one of the most effective sequential optimization techniques, retiming is a structural transformation that relocates flip-flops in a circuit without changing its functionality. The min-area retiming problem seeks a solution with the minimum flip-flop area (or number) under a given clock period. Even though they have polynomial runtime, the best existing algorithms for this problem still need to first construct a dense path graph and then find a min-cost network flow on it, and thus incur huge storage and time expenses for large circuits. Recently, provable incremental algorithms have been discovered for min-period retiming, and heuristic incremental algorithms have been proposed for min-area retiming. However, given the complexity of the problem, min-area retiming is still resisting an efficient provable incremental algorithm. In this paper, we fill the gap by presenting an efficient algorithm to solve the min-area retiming problem incrementally and optimally. Contrary to existing approaches, no dense path graph is constructed; only the active timing constraints are dynamically generated in the algorithm. Experimental results show that the total runtime of our algorithm for all the benchmarks is at least 60× faster than the best existing approach.
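
    As background for the "dense path graph" mentioned in this abstract, the classical Leiserson-Saxe statement of min-area retiming can be written as the linear program below. This is the textbook formulation without fanout sharing, not the incremental algorithm of the paper, and the notation is assumed here: $r(v)$ is the retiming label of gate $v$, $w(e)$ the register count on edge $e$, $\mathrm{FI}(v)$ and $\mathrm{FO}(v)$ the fan-in and fan-out edge sets, $W(u,v)$ the minimum register count over paths from $u$ to $v$, $D(u,v)$ the largest delay among those minimum-register paths, and $c$ the target clock period.

```latex
\begin{aligned}
\text{minimize}\quad
  & \sum_{v \in V} \bigl(|\mathrm{FI}(v)| - |\mathrm{FO}(v)|\bigr)\, r(v) \\
\text{subject to}\quad
  & r(u) - r(v) \le w(e)
  && \text{for every edge } e = (u \to v), \\
  & r(u) - r(v) \le W(u,v) - 1
  && \text{for all } u, v \text{ with } D(u,v) > c .
\end{aligned}
```

    The second constraint family can contain one inequality per ordered pair of gates, which is where the quadratic storage cost of a dense formulation comes from.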

    Rewired retiming for flip-flop reduction and low power without delay penalty.

    Jiang, Mingqi. Thesis (M.Phil.)--Chinese University of Hong Kong, 2009. Includes bibliographical references (leaves [49]-51). Abstract also in Chinese.

    Abstract
    Acknowledgement
    Chapter 1: Introduction
    Chapter 2: Rewiring Background
        2.1 REWIRE
        2.2 GBAW
    Chapter 3: Retiming
        3.1 Min-Clock Period Retiming
        3.2 Min-Area Retiming
        3.3 Retiming for Low Power
        3.4 Retiming with Interconnect Delay
    Chapter 4: Rewired Retiming for Flip-flop Reduction
        4.1 Motivation and Problem Formulation
        4.2 Retiming Indication
        4.3 Target Wire Selection
        4.4 Incremental Placement Update
        4.5 Optimization Flow
        4.6 Experimental Results
    Chapter 5: Power Analysis for Rewired Retiming
        5.1 Power Model
        5.2 Experimental Results
    Chapter 6: Conclusion
    Bibliography

    Fast algorithms for retiming large digital circuits

    The increasing complexity of VLSI systems and shrinking time to market requirements demand good optimization tools capable of handling large circuits. Retiming is a powerful transformation that preserves functionality, and can be used to optimize sequential circuits for a wide range of objective functions by judiciously relocating the memory elements. Leiserson and Saxe, who introduced the concept, presented algorithms for period optimization (minperiod retiming) and area optimization (minarea retiming). The ASTRA algorithm proposed an alternative view of retiming using the equivalence between retiming and clock skew optimization. The first part of this thesis defines the relationship between the Leiserson-Saxe and the ASTRA approaches and utilizes it for efficient minarea retiming of large circuits. The new algorithm, Minaret, uses the same linear program formulation as the Leiserson-Saxe approach. The underlying philosophy of the ASTRA approach is incorporated to reduce the number of variables and constraints in this linear program. This allows minarea retiming of circuits with over 56,000 gates in under fifteen minutes. The movement of flip-flops in control logic changes the state encoding of finite state machines, requiring the preservation of initial (reset) states. In the next part of this work, the problem of minimizing the number of flip-flops in control logic, subject to a specified clock period and with the guarantee of an equivalent initial state, is formulated as a mixed integer linear program. Bounds on the retiming variables are used to guarantee an equivalent initial state in the retimed circuit. These bounds lead to a simple method for calculating an equivalent initial state for the retimed circuit. The transparent nature of level-sensitive latches enables level-clocked circuits to operate faster and require less area. However, this transparency makes the operation of level-clocked circuits very complex, and optimization of level-clocked circuits is a difficult task. This thesis also presents efficient algorithms for retiming large level-clocked circuits. The relationship between retiming and clock skew optimization for level-clocked circuits is defined and utilized to develop efficient retiming algorithms for period and area optimization. Using these algorithms, a circuit with 56,000 gates could be retimed for minimum period in under twenty seconds and for minimum area in under 1.5 hours.

    Computations of Uniform Recurrence Equations Using Minimal Memory Size

    We consider a system of uniform recurrence equations (URE) of dimension one. We show how its computation can be carried out using minimal memory size with several synchronous processors. This result is then applied to register minimization for digital circuits and parallel computation of task graphs.

    Advanced Timing and Synchronization Methodologies for Digital VLSI Integrated Circuits

    This dissertation addresses timing and synchronization methodologies that are critical to the design, analysis and optimization of high-performance, integrated digital VLSI systems. As process sizes shrink and design complexities increase, achieving timing closure for digital VLSI circuits becomes a significant bottleneck in the integrated circuit design flow. Circuit designers are motivated to investigate and employ alternative methods to satisfy the timing and physical design performance targets. Such novel methods for the timing and synchronization of complex circuitry are developed in this dissertation and analyzed for performance and applicability. The mainstream integrated circuit design flow is normally tuned for zero clock skew, edge-triggered circuit design. Non-zero clock skew or multi-phase clock synchronization is seldom used because the lack of design automation tools increases the length and cost of the design cycle. For similar reasons, level-sensitive registers have not become an industry standard despite their superior size, speed and power consumption characteristics compared to conventional edge-triggered flip-flops. In this dissertation, novel techniques that fully automate the design and analysis of non-zero clock skew circuits are presented. Clock skew scheduling of both edge-triggered and level-sensitive circuits is investigated in order to maximize circuit performance. The effects of multi-phase clocking on non-zero clock skew, level-sensitive circuits are investigated, leading to advanced synchronization methodologies. Improvements in the scalability of the computational timing analysis process with clock skew scheduling are explored through partitioning and parallelization. The integration of the proposed design and analysis methods into the physical design flow of integrated circuits synchronized with a next-generation clocking technology (resonant rotary clocking) is also presented. Based on the design and analysis methods presented in this dissertation, a computer-aided design tool for the design of rotary clock synchronized integrated circuits is developed.

    Conception d'un système de synthèse orienté-objet multiplateforme en vue d'une nouvelle méthode de synthèse

    Moore's law predicts that the number of components in a circuit doubles every 18 months. This growth reduces delays within the components, but it increases interconnect delays relative to component delays, as well as power consumption. Recently, interconnect delays have become so large relative to logic gate delays that the current method of automated integrated circuit synthesis has become inadequate. Since interconnects are handled during physical synthesis, a new approach that reverses the order of physical synthesis and logic synthesis has been considered. The design of a system that uses an object-oriented language and offers portability and the integration of future modules was the subject of this research, since no system using such a process has yet been built. A synthesis platform was developed and tested with a delay budget management module. First, the logic description of the circuit produced by behavioral synthesis was read using a lexical analyzer and a parser. Then, during this reading, a hierarchical Boolean network representing the circuit was built according to a predefined infrastructure. To test the platform, delay budgets were assigned to each node of the network by propagating arrival times and required times through a circuit derived from a complex hierarchical logic description. Finally, delay budget management was performed by an algorithm designed for this purpose, and its results were analyzed. The result is a synthesis platform capable of managing delay budgets on the critical paths of a given circuit. Moreover, the platform can be reused for other projects related to circuit synthesis. The relevance of this research lies in solving a growing problem in the world of automated integrated circuit synthesis.

    Efficient Implementation of Retiming

    Retiming is a technique for optimizing sequential circuits. It repositions the registers in a circuit, leaving the combinational cells untouched. The objective of retiming is to find a circuit with the minimum number of registers for a specified clock period. More than ten years have elapsed since Leiserson and Saxe first presented a theoretical formulation to solve this problem for single-clock edge-triggered sequential circuits. Their proposed algorithms have polynomial complexity; however, naive implementations of these algorithms exhibit O(n^3) time complexity and O(n^2) space complexity when applied to digital circuits with n combinational cells. This renders retiming ineffective for circuits with more than 500 combinational cells. This paper addresses the implementation issues required to exploit the sparsity of circuit graphs to allow min-period retiming and constrained min-area retiming to be applied to circuits with as many as 10,000 combinational cells. We believe this is the first paper to address these issues and the first to report retiming results for large circuits.
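
    One plausible source of the cubic time and quadratic space figures quoted in this abstract is the dense computation of the Leiserson-Saxe W and D matrices with an all-pairs shortest-path pass. The sketch below illustrates that naive computation only; it is not the sparse implementation the paper describes, and the function name and data layout are assumptions made for the example. It assumes every cycle in the circuit carries at least one register.

```python
import math

def wd_matrices(delay, edges):
    """Naive W/D computation via Floyd-Warshall on lexicographic pair weights.

    W[u][v]: minimum number of registers on any path u -> v.
    D[u][v]: maximum total gate delay among the paths that achieve W[u][v].
    Each edge u -> v with w registers gets weight (w, -delay[u]); pairs are
    added componentwise and compared lexicographically.
    """
    nodes = list(delay)
    unreachable = (math.inf, 0)
    dist = {u: {v: unreachable for v in nodes} for u in nodes}  # O(n^2) storage
    for u in nodes:
        dist[u][u] = (0, 0)  # the empty path
    for u, v, w in edges:
        dist[u][v] = min(dist[u][v], (w, -delay[u]))
    for k in nodes:  # three nested loops over all gates: O(n^3) time
        for i in nodes:
            if dist[i][k][0] == math.inf:
                continue
            for j in nodes:
                if dist[k][j][0] == math.inf:
                    continue
                via = (dist[i][k][0] + dist[k][j][0],
                       dist[i][k][1] + dist[k][j][1])
                if via < dist[i][j]:
                    dist[i][j] = via
    W = {u: {v: dist[u][v][0] for v in nodes} for u in nodes}
    D = {u: {v: delay[v] - dist[u][v][1]
             for v in nodes if dist[u][v][0] != math.inf}
         for u in nodes}
    return W, D

# Hypothetical 3-gate ring with unit delays and two registers on the edge c -> a.
delay = {"a": 1, "b": 1, "c": 1}
edges = [("a", "b", 0), ("b", "c", 0), ("c", "a", 2)]
W, D = wd_matrices(delay, edges)
print(W["a"]["c"], D["a"]["c"])  # registers and delay on the path a -> b -> c
```

    Because both tables hold an entry for every ordered pair of gates regardless of how sparse the circuit graph is, memory grows quadratically with circuit size.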