218 research outputs found

    On-Chip Transparent Wire Pipelining (invited paper)

    Get PDF
    Wire pipelining has been proposed as a viable mean to break the discrepancy between decreasing gate delays and increasing wire delays in deep-submicron technologies. Far from being a straightforwardly applicable technique, this methodology requires a number of design modifications in order to insert it seamlessly in the current design flow. In this paper we briefly survey the methods presented by other researchers in the field and then we thoroughly analyze the solutions we recently proposed, ranging from system-level wire pipelining to physical design aspects

    Placement driven retiming with a coupled edge timing model

    Get PDF
    Retiming is a widely investigated technique for performance optimization. It performs powerful modifications on a circuit netlist. However, often it is not clear, whether the predicted performance improvement will still be valid after placement has been performed. This paper presents a new retiming algorithm using a highly accurate timing model taking into account the effect of retiming on capacitive loads of single wires as well as fanout systems. We propose the integration of retiming into a timing-driven standard cell placement environment based on simulated annealing. Retiming is used as an optimization technique throughout the whole placement process. The experimental results show the benefit of the proposed approach. In comparison with the conventional design flow based on standard FEAS our approach achieved an improvement in cycle time of up to 34% and 17% on the average

    Fast algorithms for retiming large digital circuits

    Get PDF
    The increasing complexity of VLSI systems and shrinking time to market requirements demand good optimization tools capable of handling large circuits. Retiming is a powerful transformation that preserves functionality, and can be used to optimize sequential circuits for a wide range of objective functions by judiciously relocating the memory elements. Leiserson and Saxe, who introduced the concept, presented algorithms for period optimization (minperiod retiming) and area optimization (minarea retiming). The ASTRA algorithm proposed an alternative view of retiming using the equivalence between retiming and clock skew optimization;The first part of this thesis defines the relationship between the Leiserson-Saxe and the ASTRA approaches and utilizes it for efficient minarea retiming of large circuits. The new algorithm, Minaret, uses the same linear program formulation as the Leiserson-Saxe approach. The underlying philosophy of the ASTRA approach is incorporated to reduce the number of variables and constraints in this linear program. This allows minarea retiming of circuits with over 56,000 gates in under fifteen minutes;The movement of flip-flops in control logic changes the state encoding of finite state machines, requiring the preservation of initial (reset) states. In the next part of this work the problem of minimizing the number of flip-flops in control logic subject to a specified clock period and with the guarantee of an equivalent initial state, is formulated as a mixed integer linear program. Bounds on the retiming variables are used to guarantee an equivalent initial state in the retimed circuit. These bounds lead to a simple method for calculating an equivalent initial state for the retimed circuit;The transparent nature of level sensitive latches enables level-clocked circuits to operate faster and require less area. However, this transparency makes the operation of level-clocked circuits very complex, and optimization of level-clocked circuits is a difficult task. This thesis also presents efficient algorithms for retiming large level-clocked circuits. The relationship between retiming and clock skew optimization for level-clocked circuits is defined and utilized to develop efficient retiming algorithms for period and area optimization. Using these algorithms a circuit with 56,000 gates could be retimed for minimum period in under twenty seconds and for minimum area in under 1.5 hours

    Retiming with wire delay and post-retiming register placement.

    Get PDF
    Tong Ka Yau Dennis.Thesis (M.Phil.)--Chinese University of Hong Kong, 2004.Includes bibliographical references (leaves 77-81).Abstracts in English and Chinese.Chapter 1 --- Introduction --- p.1Chapter 1.1 --- Motivations --- p.1Chapter 1.2 --- Progress on the Problem --- p.2Chapter 1.3 --- Our Contributions --- p.3Chapter 1.4 --- Thesis Organization --- p.4Chapter 2 --- Background on Retiming --- p.5Chapter 2.1 --- Introduction --- p.5Chapter 2.2 --- Preliminaries --- p.7Chapter 2.3 --- Retiming Problem --- p.9Chapter 3 --- Literature Review on Retiming --- p.10Chapter 3.1 --- Introduction --- p.10Chapter 3.2 --- The First Retiming Paper --- p.11Chapter 3.2.1 --- """Retiming Synchronous Circuitry""" --- p.11Chapter 3.3 --- Important Extensions of the Basic Retiming Algorithm --- p.14Chapter 3.3.1 --- """A Fresh Look at Retiming via Clock Skew Optimization""" --- p.14Chapter 3.3.2 --- """An Improved Algorithm for Minimum-Area Retiming""" --- p.16Chapter 3.3.3 --- """Efficient Implementation of Retiming""" --- p.17Chapter 3.4 --- Retiming in Physical Design Stages --- p.19Chapter 3.4.1 --- """Physical Planning with Retiming""" --- p.19Chapter 3.4.2 --- """Simultaneous Circuit Partitioning/Clustering with Re- timing for Performance Optimization" --- p.20Chapter 3.4.3 --- """Performance Driven Multi-level and Multiway Parti- tioning with Retiming" --- p.22Chapter 3.5 --- Retiming with More Sophisticated Timing Models --- p.23Chapter 3.5.1 --- """Retiming with Non-zero Clock Skew, Variable Register, and Interconnect Delay""" --- p.23Chapter 3.5.2 --- """Placement Driven Retiming with a Coupled Edge Tim- ing Model""" --- p.24Chapter 3.6 --- Post-Retiming Register Placement --- p.26Chapter 3.6.1 --- """Layout Driven Retiming Using the Coupled Edge Tim- ing Model""" --- p.26Chapter 3.6.2 --- """Integrating Logic Retiming and Register Placement""" --- p.27Chapter 4 --- Retiming with Gate and Wire Delay [2] --- p.29Chapter 4.1 --- Introduction --- p.29Chapter 4.2 --- Problem Formulation --- p.30Chapter 4.3 --- Optimal Approach [2] --- p.31Chapter 4.3.1 --- Original Mathematical Framework for Retiming --- p.31Chapter 4.3.2 --- A Modified Optimal Approach --- p.33Chapter 4.4 --- Near-Optimal Fast Approach [2] --- p.37Chapter 4.4.1 --- Considering Wire Delay Only --- p.38Chapter 4.4.2 --- Considering Both Gate and Wire Delay --- p.42Chapter 4.4.3 --- Computational Complexity --- p.43Chapter 4.4.4 --- Experimental Results --- p.44Chapter 4.5 --- Lin's Optimal Approach [23] --- p.47Chapter 4.5.1 --- Theoretical Results --- p.47Chapter 4.5.2 --- Algorithm Description --- p.51Chapter 4.5.3 --- Computational Complexity --- p.52Chapter 4.5.4 --- Experimental Results --- p.52Chapter 4.6 --- Summary --- p.54Chapter 5 --- Register Insertion in Placement [36] --- p.55Chapter 5.1 --- Introduction --- p.55Chapter 5.2 --- Problem Formulation --- p.57Chapter 5.3 --- Placement of Registers After Retiming --- p.60Chapter 5.3.1 --- Topology Finding --- p.60Chapter 5.3.2 --- Register Placement --- p.69Chapter 5.4 --- Experimental Results --- p.71Chapter 5.5 --- Summary --- p.74Chapter 6 --- Conclusion --- p.75Bibliography --- p.7

    Linearization of The Timing Analysis and Optimization of Level-Sensitive Circuits

    Get PDF
    This thesis describes a linear programming (LP) formulation applicable to the static timing analysis of large scale synchronous circuits with level-sensitive latches. The automatic timing analysis procedure presented here is composed of deriving the connectivity information, constructing the LP model and solving the clock period minimization problem of synchronous digital VLSI circuits. In synchronous circuits with level-sensitive latches, operation at a reduced clock period (higher clock frequency) is possible by takingadvantage of both non-zero clock skew scheduling and time borrowing. Clock skew schedulingis performed in order to exploit the benefits of nonidentical clock signal delays on circuit timing. The time borrowing property of level-sensitive circuits permits higher operating frequencies compared to edge-sensitivecircuits. Considering time borrowing in the timing analysis, however, introduces non-linearity in this timing analysis. The modified big M (MBM) method is defined in order to transform the non-linear constraints arising in the problem formulation into solvable linear constraints. Equivalent LP model problemsfor single-phase clock synchronization of the ISCAS'89 benchmark circuits are generated and these problems are solved by the industrial LP solver CPLEX. Through the simultaneous application of time borrowing and clock skew scheduling, up to 63% improvements are demonstrated in minimum clock period with respect to zero-skew edge-sensitive synchronous circuits. The timing constraints governing thelevel-sensitive synchronous circuit operation not only solve the clock period minimization problem but also provide a common framework for the general timing analysis of such circuits. The inclusion of additional constraints into the problem formulation in order to meet the timing requirements imposed by specific applicationenvironments is discussed

    Mix & Latch: An Optimization Flow for High-Performance Designs with Single-Clock Mixed-Polarity Latches and Flip-Flops

    Get PDF
    Flip-flops are the most used sequential elements in synchronous circuits, but designs based on latches can operate at higher frequencies and occupy less area. Techniques to increase the maximum operating frequency of flip-flop based designs, such as time-borrowing, rely on tight hold constraints that are difficult to satisfy using traditional back-end optimization techniques. We propose Mix & Latch , a methodology to increase the operating frequency of synchronous digital circuits using a single clock tree and a mixed distribution of positive- and negative-edge-triggered flops, and positive- and negative-level-sensitive latches. An efficient mathematical model is proposed to optimize the type and location of the sequential elements of the circuit. We ensure that the initial registers are not moved from their initial location, although they may change type, thus allowing the use of equivalence checking and static timing analysis to verify formally the correctness of the transformation. The technique is validated using a 28nm CMOS FDSOI technology, obtaining 1.33X post-layout average operating frequency improvement on a broad set of benchmarks over a standard commercial design flow. Additionally, the circuit area was also reduced by more than 1.19X on average for the same benchmarks, although the overall area reduction is not a goal of the optimization algorithm. To the best of our knowledge, this is the first work that proposes combining mixed-polarity flip-flops and latches to improve the circuit performance

    Energy-Aware Scheduling of FIR Filter Structures using a Timed Automata Model

    Get PDF
    corecore