314 research outputs found

    Linearization of The Timing Analysis and Optimization of Level-Sensitive Circuits

    Get PDF
    This thesis describes a linear programming (LP) formulation applicable to the static timing analysis of large scale synchronous circuits with level-sensitive latches. The automatic timing analysis procedure presented here is composed of deriving the connectivity information, constructing the LP model and solving the clock period minimization problem of synchronous digital VLSI circuits. In synchronous circuits with level-sensitive latches, operation at a reduced clock period (higher clock frequency) is possible by takingadvantage of both non-zero clock skew scheduling and time borrowing. Clock skew schedulingis performed in order to exploit the benefits of nonidentical clock signal delays on circuit timing. The time borrowing property of level-sensitive circuits permits higher operating frequencies compared to edge-sensitivecircuits. Considering time borrowing in the timing analysis, however, introduces non-linearity in this timing analysis. The modified big M (MBM) method is defined in order to transform the non-linear constraints arising in the problem formulation into solvable linear constraints. Equivalent LP model problemsfor single-phase clock synchronization of the ISCAS'89 benchmark circuits are generated and these problems are solved by the industrial LP solver CPLEX. Through the simultaneous application of time borrowing and clock skew scheduling, up to 63% improvements are demonstrated in minimum clock period with respect to zero-skew edge-sensitive synchronous circuits. The timing constraints governing thelevel-sensitive synchronous circuit operation not only solve the clock period minimization problem but also provide a common framework for the general timing analysis of such circuits. The inclusion of additional constraints into the problem formulation in order to meet the timing requirements imposed by specific applicationenvironments is discussed

    Advanced Timing and Synchronization Methodologies for Digital VLSI Integrated Circuits

    Get PDF
    This dissertation addresses timing and synchronization methodologies that are critical to the design, analysis and optimization of high-performance, integrated digital VLSI systems. As process sizes shrink and design complexities increase, achieving timing closure for digital VLSI circuits becomes a significant bottleneck in the integrated circuit design flow. Circuit designers are motivated to investigate and employ alternative methods to satisfy the timing and physical design performance targets. Such novel methods for the timing and synchronization of complex circuitry are developed in this dissertation and analyzed for performance and applicability.Mainstream integrated circuit design flow is normally tuned for zero clock skew, edge-triggered circuit design. Non-zero clock skew or multi-phase clock synchronization is seldom used because the lack of design automation tools increases the length and cost of the design cycle. For similar reasons, level-sensitive registers have not become an industry standard despite their superior size, speed and power consumption characteristics compared to conventional edge-triggered flip-flops.In this dissertation, novel design and analysis techniques that fully automate the design and analysis of non-zero clock skew circuits are presented. Clock skew scheduling of both edge-triggered and level-sensitive circuits are investigated in order to exploit maximum circuit performances. The effects of multi-phase clocking on non-zero clock skew, level-sensitive circuits are investigated leading to advanced synchronization methodologies. Improvements in the scalability of the computational timing analysis process with clock skew scheduling are explored through partitioning and parallelization.The integration of the proposed design and analysis methods to the physical design flow of integrated circuits synchronized with a next-generation clocking technology-resonant rotary clocking technology-is also presented. Based on the design and analysis methods presented in this dissertation, a computer-aided design tool for the design of rotary clock synchronized integrated circuits is developed

    Detection of feed-through faults in CMOS storage elements

    Get PDF
    In testing sequential circuits, internal faults in the storage elements (SE's) are sometimes modeled as stuck-at faults in the combinational circuits surrounding the SE. The detection of some transistor-level faults that cannot be modeled as stuck-at are considered. These feed-through faults cause the cell to become either data-feed-through, which makes the cell combinational, or clock-feed-through, which causes the clock signal or its complement to appear at the output. Under such faults, the cell does not function as a memory element. Here it is shown that such faults may or may not be detected depending on delays involved. Conditions under which race-ahead occurs are identified

    VLSI design methodology

    Get PDF

    Doctor of Philosophy

    Get PDF
    dissertationThe bandwidth requirement for each link on a network-on-chip (NoC) may differ based on topology and traffic properties of the IP cores. Available bandwidth on an asynchronous NoC link will also vary depending on the wire length between sender and receiver. This work explores the benefit to NoC performance, area, and energy when this property is used to optimize bandwidth on specific links based on its bandwidth required by a target SoC design. Three asynchronous routers were designed for implementing of asynchronous NoCs. Simple routing scheme and single-flit packet format lead to performance- and area-efficient router designs. Their performance was evaluated in consideration of link wire delay. Comprehensive analysis of pipeline latch insertion in asynchronous communication links is performed in regard to link bandwidth. Optimal placement of pipeline latch for maximizing benefit to increase of bandwidth is described. Specific methods are proposed for performance, area and energy optimization, respectively. Performance optimization is achieved by increasing bandwidth of high trafficked and high utilized links in an NoC, as inserting pipeline latches in those links. Through decrease of bandwidth of links with low traffic and low utilization by halving data-path width, reduction of wire area of an NoC is accomplished. Energy optimization is performed using wide spacing between wires in links with high energy consumption. An analytical model for asynchronous link bandwidth estimation is presented. It is utilized to deploy NoC optimization methods as identifying adequate links for each optimization method. Energy and latency characteristics of an asynchronous NoC are compared to a similarly-designed synchronous NoC. The results indicate that the asynchronous network has lower energy, and link-specific bandwidth optimization has improved NoC performance. Evaluation of proposed optimization methods by employing to an asynchronous NoC shows achievements of performance enhancement, wire area reduction and wire energy saving

    Energy-Efficient Digital Circuit Design using Threshold Logic Gates

    Get PDF
    abstract: Improving energy efficiency has always been the prime objective of the custom and automated digital circuit design techniques. As a result, a multitude of methods to reduce power without sacrificing performance have been proposed. However, as the field of design automation has matured over the last few decades, there have been no new automated design techniques, that can provide considerable improvements in circuit power, leakage and area. Although emerging nano-devices are expected to replace the existing MOSFET devices, they are far from being as mature as semiconductor devices and their full potential and promises are many years away from being practical. The research described in this dissertation consists of four main parts. First is a new circuit architecture of a differential threshold logic flipflop called PNAND. The PNAND gate is an edge-triggered multi-input sequential cell whose next state function is a threshold function of its inputs. Second a new approach, called hybridization, that replaces flipflops and parts of their logic cones with PNAND cells is described. The resulting \hybrid circuit, which consists of conventional logic cells and PNANDs, is shown to have significantly less power consumption, smaller area, less standby power and less power variation. Third, a new architecture of a field programmable array, called field programmable threshold logic array (FPTLA), in which the standard lookup table (LUT) is replaced by a PNAND is described. The FPTLA is shown to have as much as 50% lower energy-delay product compared to conventional FPGA using well known FPGA modeling tool called VPR. Fourth, a novel clock skewing technique that makes use of the completion detection feature of the differential mode flipflops is described. This clock skewing method improves the area and power of the ASIC circuits by increasing slack on timing paths. An additional advantage of this method is the elimination of hold time violation on given short paths. Several circuit design methodologies such as retiming and asynchronous circuit design can use the proposed threshold logic gate effectively. Therefore, the use of threshold logic flipflops in conventional design methodologies opens new avenues of research towards more energy-efficient circuits.Dissertation/ThesisDoctoral Dissertation Computer Science 201

    Single event upset hardened embedded domain specific reconfigurable architecture

    Get PDF

    Energy-efficient processor design using multiple clock domains with dynamic voltage and frequency scaling

    Get PDF
    Journal ArticleAs clock frequency increases and feature size decreases, clock distribution and wire delays present a growing challenge to the designers of singly-clocked, globally synchronous systems. We describe an alternative approach, which we call a Multiple Clock Domain (MCD) processor in which the chip is divided into several (coarse-grained) clock domains, within which independent voltage and frequency scaling can be performed. Boundaries between domains are chosen to exploit existing queues, thereby minimizing inter-domain synchronization costs. We propose four clock domains, corresponding to the front end (including LI instruction cache), integer units, floating point units, and load-store units (including Ll data cache and L2 cache). We evaluate this design using a simulation infrastructure based on SimpleScalar and Wattch. In an attempt to quantify potential energy savings independent of any particular on-line control strategy, we use of-line analysis of traces from a single-speed run of each of our benchmark applications to identify profitable reconfiguration points for a subsequent dynamic scaling run. Dynamic runs incorporate a detailed model of inter-domain synchronization delays, with latencies for intra-domain scaling similar to the whole-chip scaling latencies of Intel XScale and Transmeta LongRun technologies. Using applications from the MediaBench, Olden, and SPEC2000 benchmark suites, we obtain an average energy-delay product improvement of 20% with MCD compared to a modest 3% savings from voltage scaling a single clock and voltage system
    corecore