3,551 research outputs found

    Timed circuits: a new paradigm for high-speed design

    Get PDF
    Journal ArticleAbstract| In order to continue to produce circuits of increasing speeds, designers must consider aggressive circuit design styles such as self-resetting or delayed-reset domino circuits used in IBM's gigahertz processor (GUTS) and asynchronous circuits used in Intel's RAPPID instruction length decoder. These new timed circuit styles, however, cannot be efficiently and accurately analyzed using traditional static timing analysis methods. This lack of efficient analysis tools is one of the reasons for the lack of mainstream acceptance of these design styles. This paper discusses several industrial timed circuits and gives an overview of our timed circuit design methodology

    Verification of delayed-reset domino circuits using ATACS

    Get PDF
    Journal ArticleThis paper discusses the application of the timing analysis tool ATACS to the high performance, self-resetting and delayed-reset domino circuits being designed at IBM's Austin Research Laboratory. The tool, which was originally developed to deal with asynchronous circuits, is well suited to the self-resetting style since internally, a block of selfresetting or delayed-reset domino logic is asynchronous. The circuits are represented using timed event/level structures. These structures correspond very directly to gate level circuits, making the translation from a transistor schematic to a TEL structure straightforward. The statespace explosion problem is mitigated using an algorithm based on partially ordered sets (POSETs). Results on a number of circuits from the recently published guTS (gigahertz unit Test Site) processor from IBM indicate that modules of significant size can be verified with ATACS using a level of abstraction that preserves the interesting timing properties of the circuit. Accurate circuit level verification allows the designer to include less margin in the design, which can lead to increased performance

    A micropower centroiding vision processor

    Get PDF
    Published versio

    Sensitivity analysis of oscillator models in the space of phase-response curves: Oscillators as open systems

    Full text link
    Oscillator models are central to the study of system properties such as entrainment or synchronization. Due to their nonlinear nature, few system-theoretic tools exist to analyze those models. The paper develops a sensitivity analysis for phase-response curves, a fundamental one-dimensional phase reduction of oscillator models. The proposed theoretical and numerical analysis tools are illustrated on several system-theoretic questions and models arising in the biology of cellular rhythms

    An asynchronous instruction length decoder

    Get PDF
    Journal ArticleAbstract-This paper describes an investigation of potential advantages and pitfalls of applying an asynchronous design methodology to an advanced microprocessor architecture. A prototype complex instruction set length decoding and steering unit was implemented using self-timed circuits. [The Revolving Asynchronous Pentium® Processor Instruction Decoder (RAPPID) design implemented the complete Pentium II® 32-bit MMX instruction set.] The prototype chip was fabricated on a 0.25-CMOS process and tested successfully. Results show significant advantages-in particular, performance of 2.5-4.5 instructions per nanosecond-with manageable risks using this design technology. The prototype achieves three times the throughput and half the latency, dissipating only half the power and requiring about the same area as the fastest commercial 400-MHz clocked circuit fabricated on the same process

    Area/latency optimized early output asynchronous full adders and relative-timed ripple carry adders

    Get PDF
    This article presents two area/latency optimized gate level asynchronous full adder designs which correspond to early output logic. The proposed full adders are constructed using the delay-insensitive dual-rail code and adhere to the four-phase return-to-zero handshaking. For an asynchronous ripple carry adder (RCA) constructed using the proposed early output full adders, the relative-timing assumption becomes necessary and the inherent advantages of the relative-timed RCA are: (1) computation with valid inputs, i.e., forward latency is data-dependent, and (2) computation with spacer inputs involves a bare minimum constant reverse latency of just one full adder delay, thus resulting in the optimal cycle time. With respect to different 32-bit RCA implementations, and in comparison with the optimized strong-indication, weak-indication, and early output full adder designs, one of the proposed early output full adders achieves respective reductions in latency by 67.8, 12.3 and 6.1 %, while the other proposed early output full adder achieves corresponding reductions in area by 32.6, 24.6 and 6.9 %, with practically no power penalty. Further, the proposed early output full adders based asynchronous RCAs enable minimum reductions in cycle time by 83.4, 15, and 8.8 % when considering carry-propagation over the entire RCA width of 32-bits, and maximum reductions in cycle time by 97.5, 27.4, and 22.4 % for the consideration of a typical carry chain length of 4 full adder stages, when compared to the least of the cycle time estimates of various strong-indication, weak-indication, and early output asynchronous RCAs of similar size. All the asynchronous full adders and RCAs were realized using standard cells in a semi-custom design fashion based on a 32/28 nm CMOS process technology

    An asynchronous instruction length decoder

    Get PDF
    Journal ArticleThis paper describes an investigation of potential advantages and pitfalls of applying an asynchronous design methodology to an advanced microprocessor architecture. A prototype complex instruction set length decoding and steering unit was implemented using self-timed circuits. [The Revolving Asynchronous Pentium® Processor Instruction Decoder (RAPPID) design implemented the complete Pentium II® 32-bit MMX instruction set.] The prototype chip was fabricated on a 0.25-CMOS process and tested successfully. Results show significant advantages-in particular, performance of 2.5-4.5 instructions per nanosecond-with manageable risks using this design technology. The prototype achieves three times the throughput and half the latency, dissipating only half the power and requiring about the same area as the fastest commercial 400-MHz clocked circuit fabricated on the same process

    RAPPID: an asynchronous instruction length decoder

    Get PDF
    Journal ArticleThis paper describes an investigation of potential advantages and risks of applying an aggressive asynchronous design methodology to Intel Architecture. RAPPID ("Revolving Asynchronous Pentium® Processor Instruction Decoder"), a prototype IA32 instruction length decoding and steering unit, was implemented using self-timed techniques. RAPPID chip was fabricated on a 0.25m CMOS process and tested successfully. Results show significant advantages-in particular, performance of 2.5-4.5 instructions/nS-with manageable risks using this design technology. RAPPID achieves three times the throughput and half the latency, dissipating only half the power and requiring about the same area as an existing 400MHz clocked circuit

    Asynchronous techniques for system-on-chip design

    Get PDF
    SoC design will require asynchronous techniques as the large parameter variations across the chip will make it impossible to control delays in clock networks and other global signals efficiently. Initially, SoCs will be globally asynchronous and locally synchronous (GALS). But the complexity of the numerous asynchronous/synchronous interfaces required in a GALS will eventually lead to entirely asynchronous solutions. This paper introduces the main design principles, methods, and building blocks for asynchronous VLSI systems, with an emphasis on communication and synchronization. Asynchronous circuits with the only delay assumption of isochronic forks are called quasi-delay-insensitive (QDI). QDI is used in the paper as the basis for asynchronous logic. The paper discusses asynchronous handshake protocols for communication and the notion of validity/neutrality tests, and completion tree. Basic building blocks for sequencing, storage, function evaluation, and buses are described, and two alternative methods for the implementation of an arbitrary computation are explained. Issues of arbitration, and synchronization play an important role in complex distributed systems and especially in GALS. The two main asynchronous/synchronous interfaces needed in GALS-one based on synchronizer, the other on stoppable clock-are described and analyzed

    RAPPID: an asynchronous instruction length decoder

    Get PDF
    Journal ArticleThis paper describes an investigation of potential advantages and risks of applying an aggressive asynchronous design methodology to Intel Architecture. RAPPID ("Revolving Asynchronous Pentium® Processor Instruction Decoder"), a prototype IA32 instruction length decoding and steering unit, was implemented using self-timed techniques. RAPPID chip was fabricated on a 0.25m CMOS process and tested successfully. Results show significant advantages-in particular, performance of 2.5-4.5 instructions/nS-with manageable risks using this design technology. RAPPID achieves three times the throughput and half the latency, dissipating only half the power and requiring about the same area as an existing 400MHz clocked circuit
    • …
    corecore