526 research outputs found

    DFT Techniques and Automation for Asynchronous NULL Conventional Logic Circuits

    Get PDF
    Conventional automatic test pattern generation (ATPG) algorithms fail when applied to asynchronous NULL convention logic (NCL) circuits due to the absence of a global clock and presence of more state-holding elements, leading to poor fault coverage. This paper presents a design-for-test (DFT) approach aimed at making asynchronous NCL designs testable using conventional ATPG programs. We propose an automatic DFT insertion flow (ADIF) methodology that performs scan and test point insertion on NCL designs to improve test coverage, using a custom ATPG library. Experimental results show significant increase in fault coverage for NCL cyclic and acyclic pipelined designs

    Asynchronous Circuit Synthesis Using Multi-Threshold NULL Convention Logic

    Get PDF
    As the demand for an energy-efficient alternative to traditional synchronous circuit design grows, hardware designers must reconsider the traditional clock tree. By doing away with the constrains of a clock, asynchronous sequential circuit designs can achieve a much greater level of efficiency. The utilization of asynchronous logic synthesis flows has enabled researchers to better implement asynchronous circuit designs which have been optimized using the same industry standard tools that are already used in sequential synchronous designs. This thesis offers a new flow for such tools which implements the MTNCL asynchronous circuit architecture

    CAD Tool Design for NCL and MTNCL Asynchronous Circuits

    Get PDF
    This thesis presents an implementation of a method developed to readily convert Boolean designs into an ultra-low power asynchronous design methodology called MTNCL, which combines multi-threshold CMOS (MTCMOS) with NULL Convention Logic (NCL) systems. MTNCL provides the leakage power advantages of an all high-Vt implementation with a reasonable speed penalty compared to the all low-Vt implementation, and has negligible area overhead. The proposed tool utilizes industry-standard CAD tools. This research also presents an Automated Gate-Level Pipelining with Bit-Wise Completion (AGLPBW) method to maximize throughput of delay-insensitive full-word pipelined NCL circuits. These methods have been integrated into the Mentor Graphics and Synopsis CAD tools, using a C-program, which performs the majority of the computations, such that the method can be easily ported to other CAD tool suites. Both methods have been successfully tested on circuits, including a 4-bit × 4-bit multiplier, an unsigned Booth2 multiplier, and a 4-bit/8-operation arithmetic logic unit (ALU

    Correct-by-construction microarchitectural pipelining

    Get PDF
    This paper presents a method for correct-by-construction microarchitectural pipelining that handles cyclic systems with dependencies between iterations. Our method combines previously known bypass and retiming transformations with a few transformations valid only for elastic systems with early evaluation (namely, empty FIFO insertion, FIFO capacity sizing, insertion of anti-tokens, and introducing early evaluation multiplexors). By converting the design to a synchronous elastic form and then applying this extended set of transformations, one can pipeline a functional specification with an automatically generated distributed controller that implements stalling logic resolving data hazards off the critical path of the design. We have developed an interactive toolkit for exploring elastic microarchitectural transformations. The method is illustrated by pipelining a few simple examples of instruction set architecture ISA specifications.Peer ReviewedPostprint (published version

    VeriSFQ - A Semi-formal Verification Framework and Benchmark for Single Flux Quantum Technology

    Get PDF
    In this paper, we propose a semi-formal verification framework for single-flux quantum (SFQ) circuits called VeriSFQ, using the Universal Verification Methodology (UVM) standard. The considered SFQ technology is superconducting digital electronic devices that operate at cryogenic temperatures with active circuit elements called the Josephson junction, which operate at high switching speeds and low switching energy - allowing SFQ circuits to operate at frequencies over 300 gigahertz. Due to key differences between SFQ and CMOS logic, verification techniques for the former are not as advanced as the latter. Thus, it is crucial to develop efficient verification techniques as the complexity of SFQ circuits scales. The VeriSFQ framework focuses on verifying the key circuit and gate-level properties of SFQ logic: fanout, gate-level pipeline, path balancing, and input-to-output latency. The combinational circuits considered in analyzing the performance of VeriSFQ are: Kogge-Stone adders (KSA), array multipliers, integer dividers, and select ISCAS'85 combinational benchmark circuits. Methods of introducing bugs into SFQ circuit designs for verification detection were experimented with - including stuck-at faults, fanout errors, unbalanced paths, and functional bugs like incorrect logic gates. In addition, we propose an SFQ verification benchmark consisting of combinational SFQ circuits that exemplify SFQ logic properties and present the performance of the VeriSFQ framework on these benchmark circuits. The portability and reusability of the UVM standard allows the VeriSFQ framework to serve as a foundation for future SFQ semi-formal verification techniques.Comment: 7 pages, 6 figures, 4 tables; submitted, accepted, and presented at ISQED 2019 (20th International Symposium on Quality Electronic Design) on March 7th, 2019 in Santa Clara, CA, US

    Within-Die Delay Variation Measurement And Analysis For Emerging Technologies Using An Embedded Test Structure

    Get PDF
    Both random and systematic within-die process variations (PV) are growing more severe with shrinking geometries and increasing die size. Escalation in the variations in delay and power with reductions in feature size places higher demands on the accuracy of variation models. Their availability can be used to improve yield, and the corresponding profitability and product quality of the fabricated integrated circuits (ICs). Sources of within-die variations include optical source limitations, and layout-based systematic effects (pitch, line-width variability, and microscopic etch loading). Unfortunately, accurate models of within-die PVs are becoming more difficult to derive because of their increasingly sensitivity to design-context. Embedded test structures (ETS) continue to play an important role in the development of models of PVs and as a mechanism to improve correlations between hardware and models. Variations in path delays are increasing with scaling, and are increasingly affected by neighborhood\u27 interactions. In order to fully characterize within-die variations, delays must be measured in the context of actual core-logic macros. Doing so requires the use of an embedded test structure, as opposed to traditional scribe line test structures such as ring oscillators (RO). Accurate measurements of within-die variations can be used, e.g., to better tune models to actual hardware (model-to-hardware correlations). In this research project, I propose an embedded test structure called REBEL (Regional dELay BEhavior) that is designed to measure path delays in a minimally invasive fashion; and its architecture measures the path delays more accurately. Design for manufacture-ability (DFM) analysis is done on the on 90 nm ASIC chips and 28nm Zynq 7000 series FPGA boards. I present ASIC results on within-die path delay variations in a floating-point unit (FPU) fabricated in IBM\u27s 90 nm technology, with 5 pipeline stages, used as a test vehicle in chip experiments carried out at nine different temperature/voltage (TV) corners. Also experimental data has been analyzed for path delay variations in short vs long paths. FPGA results on within-die variation and die-to-die variations on Advanced Encryption System (AES) using single pipelined stage are also presented. Other analysis that have been performed on the calibrated path delays are Flip Flop propagation delays for both rising and falling edge (tpHL and tpLH), uncertainty analysis, path distribution analysis, short versus long path variations and mid-length path within-die variation. I also analyze the impact on delay when the chips are subjected to industrial-level temperature and voltage variations. From the experimental results, it has been established that the proposed REBEL provides capabilities similar to an off-chip logic analyzer, i.e., it is able to capture the temporal behavior of the signal over time, including any static and dynamic hazards that may occur on the tested path. The ASIC results further show that path delays are correlated to the launch-capture (LC) interval used to time them. Therefore, calibration as proposed in this work must be carried out in order to obtain an accurate analysis of within-die variations. Results on ASIC chips show that short paths can vary up to 35% on average, while long paths vary up to 20% at nominal temperature and voltage. A similar trend occurs for within-die variations of mid-length paths where magnitudes reduced to 20% and 5%, respectively. The magnitude of delay variations in both these analyses increase as temperature and voltage are changed to increase performance. The high level of within-die delay variations are undesirable from a design perspective, but they represent a rich source of entropy for applications that make use of \u27secrets\u27 such as authentication, hardware metering and encryption. Physical unclonable functions (PUFs) are a class of primitives that leverage within-die-variations as a means of generating random bit strings for these types of applications, including hardware security and trust. Zynq FPGAs Die-to-Die and within-die variation study shows that on average there is 5% of within-Die variation and the range of die-to-Die variation can go upto 3ns. The die-to-Die variations can be explored in much further detail to study the variations spatial dependance. Additionally, I also carried out research in the area data mining to cater for big data by focusing the work on decision tree classification (DTC) to speed-up the classification step in hardware implementation. For this purpose, I devised a pipelined architecture for the implementation of axis parallel binary decision tree classification for meeting up with the requirements of execution time and minimal resource usage in terms of area. The motivation for this work is that analyzing larger data-sets have created abundant opportunities for algorithmic and architectural developments, and data-mining innovations, thus creating a great demand for faster execution of these algorithms, leading towards improving execution time and resource utilization. Decision trees (DT) have since been implemented in software programs. Though, the software implementation of DTC is highly accurate, the execution times and the resource utilization still require improvement to meet the computational demands in the ever growing industry. On the other hand, hardware implementation of DT has not been thoroughly investigated or reported in detail. Therefore, I propose a hardware acceleration of pipelined architecture that incorporates the parallel approach in acquiring the data by having parallel engines working on different partitions of data independently. Also, each engine is processing the data in a pipelined fashion to utilize the resources more efficiently and reduce the time for processing all the data records/tuples. Experimental results show that our proposed hardware acceleration of classification algorithms has increased throughput, by reducing the number of clock cycles required to process the data and generate the results, and it requires minimal resources hence it is area efficient. This architecture also enables algorithms to scale with increasingly large and complex data sets. We developed the DTC algorithm in detail and explored techniques for adapting it to a hardware implementation successfully. This system is 3.5 times faster than the existing hardware implementation of classification.\u2

    A low-power quadrature digital modulator in 0.18um CMOS

    Get PDF
    Quadrature digital modulation techniques are widely used in modern communication systems because of their high performance and flexibility. However, these advantages come at the cost of high power consumption. As a result, power consumption has to be taken into account as a main design factor of the modulator.In this thesis, a low-power quadrature digital modulator in 0.18um CMOS is presented with the target system clock speed of 150 MHz. The quadrature digital modulator consists of several key blocks: quadrature direct digital synthesizer (QDDS), pulse shaping filter, interpolation filter and inverse sinc filter. The design strategy is to investigate different implementations for each block and compare the power consumption of these implementations. Based on the comparison results, the implementation that consumes the lowest power will be chosen for each block. First of all, a novel low-power QDDS is proposed in the thesis. Power consumption estimation shows that it can save up to 60% of the power consumption at 150 MHz system clock frequency compared with one conventional design. Power consumption estimation results also show that using two pulse shaping blocks to process I/Q data, cascaded integrator comb (CIC) interpolation structure, and inverse sinc filter with modified canonic signed digit (MCSD) multiplication consume less power than alternative design choices. These low-power blocks are integrated together to achieve a low-power modulator. The power consumption estimation after layout shows that it only consumes about 95 mW at 150 MHz system clock rate, which is much lower than similar commercial products. The designed modulator can provide a low-power solution for various quadrature modulators. It also has an output bandwidth from 0 to 75 MHz, configurable pulse shaping filters and interpolation filters, and an internal sin(x)/x correction filter
    • …
    corecore