4 research outputs found

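    Projeto e avaliação de células padrão para a concepção de circuitos integrados assíncronos em nanotecnologia de baixo consumo [Design and evaluation of standard cells for asynchronous integrated circuits in low-power nanotechnology]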

    The semiconductor industry continually innovates by creating integrated circuit fabrication nanotechnologies aimed at the design of low-power electronic systems. This work evaluates the new 28 nm Fully-Depleted Silicon On Insulator (FD-SOI) nanotechnology applied to the design of standard cells for asynchronous systems. Unlike conventional technologies, the substrate voltage of an FD-SOI transistor is variable, which allows biasing mechanisms to manage the local power consumption of cells in the system. Complementarily, asynchronous circuits are a solid solution for reducing dynamic power consumption. A study of the biasing of basic cells in FD-SOI technology was carried out in this work, with a view to its application in the design of low-power asynchronous systems. Consumption was found to be reducible by up to 25 % through circuit biasing alone. Using the transistor options in a standard-cell library, circuits up to two times faster and with energy consumption up to 2.5 times lower are obtained. Two architectures for the biasing cell were analyzed, differing by 30 % in delay. For asynchronous C-element circuits, the difference can reach 29 % in speed and 8.5 times in energy consumption. These variations in performance and consumption demonstrate the range that can be achieved when automatic performance control techniques are employed with asynchronous design logic.
    The focus of this work is on static power reduction for microelectronic systems. On the one hand, FD-SOI (Fully-Depleted Silicon on Insulator) technology allows static power consumption to be reduced through biasing control. Complementarily, asynchronous circuits are known as an interesting solution for dynamic power saving. Both circuit design methodologies not only permit gains in static and dynamic power consumption but also open new prospects for device layout and arrangement. This, in turn, provides new power-saving techniques. Therefore, this work presents a study of biasing effects on power saving using 28 nm FD-SOI technology. Simulation results showed that power consumption can be reduced by 25 % thanks to the biasing scheme. A standard cell library offers a variety of transistors, giving circuits up to 2 times faster and a range of approximately 2.5 times in power consumption. Two architectures for automatic biasing circuits are presented, showing a difference of 30 % in delay and 2 times in power consumption. For the asynchronous C-elements, 4 architectures were compared, showing a difference of 29 % in delay and nearly 8.5 times in power consumption. These ranges of performance and power consumption justify the use of automatic performance control in asynchronous circuit design.

    Null convention logic circuits for asynchronous computer architecture

    For most of its history, computer architecture has been able to benefit from a rapid scaling in semiconductor technology, resulting in continuous improvements to CPU design. During that period, synchronous logic has dominated because of its inherent ease of design and abundant tools. However, with the scaling of semiconductor processes into deep sub-micron and then to nano-scale dimensions, computer architecture is hitting a number of roadblocks such as high power and increased process variability. Asynchronous techniques can potentially offer many advantages compared to conventional synchronous design, including average-case vs. worst-case performance, robustness in the face of process and operating point variability, and the ready availability of high-performance, fine-grained pipeline architectures. Of the many alternative approaches to asynchronous design, Null Convention Logic (NCL) has the advantage that its quasi delay-insensitive behavior makes it relatively easy to set up complex circuits without the need for exhaustive timing analysis. This thesis examines the characteristics of an NCL-based asynchronous RISC-V CPU and analyses the problems with applying NCL to CPU design. While a number of university and industry groups have previously developed small 8-bit microprocessor architectures using NCL techniques, it is still unclear whether these offer any real advantages over conventional synchronous design. A key objective of this work has been to analyse the impact of larger word widths and more complex architectures on NCL CPU implementations. The research commenced by re-evaluating existing techniques for implementing NCL on programmable devices such as FPGAs. The little work that has been undertaken previously on FPGA implementations of asynchronous logic has been inconclusive and seems to indicate that asynchronous systems cannot be easily implemented in these devices.
However, most of this work related to an alternative technique called bundled data, which is not well suited to FPGA implementation because of the difficulty in controlling and matching delays in a 'bundle' of signals. On the other hand, this thesis clearly shows that such applications are not only possible with NCL, but there are some distinct advantages in being able to prototype complex asynchronous systems in a field-programmable technology such as the FPGA. A large part of the value of NCL derives from its architectural level behavior, inherent pipelining, and optimization opportunities such as the merging of register and combinational logic functions. In this work, a number of NCL multiplier architectures have been analyzed to reveal the performance trade-offs between various non-pipelined, 1D and 2D organizations. Two-dimensional pipelining can easily be applied to regular architectures such as array multipliers in a way that is both high performance and area-efficient. It was found that the performance of 2D pipelining for small networks such as multipliers is around 260% faster than the equivalent non-pipelined design. However, the design uses 265% more transistors, so the methodology is mainly of benefit where performance is strongly favored over area. A pipelined 32-bit × 32-bit signed Baugh-Wooley multiplier with Wallace-Tree Carry Save Adders (CSA), which is representative of a real design used for CPUs and DSPs, was used to further explore this concept as it is faster and has fewer pipeline stages compared to the normal array multiplier using Ripple-Carry adders (RCA). It was found that 1D pipelining with ripple-carry chains is an efficient implementation option but becomes less so for larger multipliers, due to the completion logic for which the delay time depends largely on the number of bits involved in the completion network.
The average-case performance of ripple-carry adders was explored using random input vectors and it was observed that it offers little advantage on the smaller multiplier blocks, but this particular timing characteristic of asynchronous design styles becomes increasingly more important as word size grows. Finally, this research has resulted in the development of the first 32-bit asynchronous RISC-V CPU core. Called the Redback RISC, the architecture is a structure of pipeline rings composed of computational oscillations linked with flow completeness relationships. It has been written using NELL, a commercial description/synthesis tool that outputs standard Verilog. The Redback has been analysed and compared to two approximately equivalent industry standard 32-bit synchronous RISC-V cores (PicoRV32 and Rocket) that are already fabricated and used in industry. While the NCL implementation is larger than both commercial cores, it has similar performance and lower power compared to the PicoRV32. The implementation results were also compared against an existing NCL design tool flow (UNCLE), which showed how much the results of these implementation strategies differ. The Redback RISC has achieved a similar level of throughput and 43% better power and 34% better energy compared to one of the synchronous cores with the same benchmark test and test conditions, such as input supply voltage. However, it was shown that area is the biggest drawback for NCL CPU design. The core is roughly 2.5× larger than synchronous designs. On the other hand, its area is still 2.9× smaller than previous designs using UNCLE tools. The area penalty is largely due to the unavoidable translation into a dual-rail topology when using the standard NCL cell library.
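
As a rough illustration of the NCL building blocks this abstract refers to (not code from the thesis), the Python sketch below models a dual-rail signal and an m-of-n threshold gate with hysteresis. The names `ThresholdGate`, `encode` and `is_data` are hypothetical, and the gate behaviour is the commonly described one, assumed here: the output sets once at least m inputs are asserted and resets only when every input has returned to zero (NULL).

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class ThresholdGate:
    """Assumed model of a THmn gate: m-of-n threshold with hysteresis."""
    m: int          # number of asserted inputs needed to set the output
    out: int = 0    # stored output state (this is the hysteresis)

    def update(self, *inputs: int) -> int:
        if sum(inputs) >= self.m:
            self.out = 1          # enough inputs asserted: set
        elif not any(inputs):
            self.out = 0          # all inputs back at NULL: reset
        # otherwise hold the previous value
        return self.out


def encode(bit: Optional[int]) -> Tuple[int, int]:
    """Dual-rail encoding as (rail0, rail1); None stands for NULL."""
    if bit is None:
        return (0, 0)
    return (0, 1) if bit else (1, 0)


def is_data(rails: Tuple[int, int]) -> bool:
    """True when exactly one rail is asserted, i.e. the signal carries DATA."""
    return rails[0] != rails[1]
```

With `m` equal to the number of inputs (for example `ThresholdGate(2)` driven by two inputs), this model behaves like a C-element. Carrying every bit on two rails is also one way to see why a dual-rail topology costs area.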

    Exploiting robustness in asynchronous circuits to design fine-tunable systems

    PhD Thesis. The robustness property of a circuit defines its tolerance to the effects of process, voltage and temperature variations. The mode of signaling and event communication between computing units in asynchronous circuits makes them inherently robust. The level of robustness depends on the type of delay assumptions used in the design and specification process. In this thesis, two approaches to exploiting robustness in asynchronous circuits to design self-adapting and fine-tunable systems are investigated. In the first investigation, a Digitally Controllable Oscillator (DCO) and a computing unit are integrated such that the operating conditions of the computing unit modulate the operation of the DCO. In this investigation, the computing unit, which is a self-timed counter, interacts with the DCO in a four-phase handshake protocol. This mode of interaction ensures a DCO and computing unit system that can fine-tune its operation to adapt to the effects of variations. In this investigation, it is shown that such a system will operate correctly over a wide range of supply voltages. In the second investigation, a Digital Pulse-Width Modulator (DPWM) with coarse and fine-tune controls is designed using two Kessels counters. The coarse control of the DPWM tuned the pulse ratio and pulse frequency, while the fine-tune control exploited the robustness property of asynchronous circuits in an addition-based delay system to add or subtract delay(s) to the pulse width while maintaining a constant pulse frequency. The realized DPWM gave a constant duty ratio regardless of the operating voltage. This type of DPWM has practical application in a DC-DC converter circuit to tune the output voltage of the converter in high resolution. The Kessels counter is a loadable self-timed modulo-n counter, which is realized by decomposition using Horner’s method, specified and verified using formal asynchronous design techniques.
The decomposition method used introduced parallelism in the system by dividing the counter into a systolic array of cells, with each cell further decomposed into two parts that have distinct defined operations. The operation of the decomposed counter cell parts was specified in three stages. The first stage employed high-level specification using Labelled Petri nets (LPN). In this form, functional correctness of the decomposed counter is modelled and verified. In the second stage, a cell part is specified by combining all possible operations for that cell part in high-level form. With this approach, a combination of inputs from a defined control block activated the correct operation for a cell part. In the final stage, the LPNs were converted to Signal Transition Graphs, from which the logic circuits of the cells were synthesized using the Workcraft tool. In this thesis, the Kessels counter was implemented and fabricated in 350 nm CMOS technology. Niger Delta Development Commission (NDD
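
The abstract only names Horner's method without detailing it. As an assumption about what that refers to, the Python sketch below shows the Horner form n = b0 + 2(b1 + 2(b2 + ...)), in which each binary digit could be held by one cell of a chain. The function names are hypothetical and this is an illustration of the arithmetic only, not the thesis's counter design.

```python
from typing import List


def horner_digits(n: int) -> List[int]:
    """Split a non-negative count n into binary digits, least significant first."""
    if n < 0:
        raise ValueError("n must be non-negative")
    digits = []
    while n > 0:
        digits.append(n % 2)   # digit held by the current (hypothetical) cell
        n //= 2                # remainder is passed on to the next cell
    return digits


def horner_eval(digits: List[int]) -> int:
    """Rebuild n from its digits using the nested form 2 * (...) + b."""
    value = 0
    for d in reversed(digits):
        value = 2 * value + d
    return value
```

In this sketch, one digit per cell gives the kind of regular, repeated structure that a systolic array of cells suggests.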

    Bibliography of Lewis Research Center technical publications announced in 1977

    This compilation of abstracts describes and indexes over 780 technical reports resulting from the scientific and engineering work performed and managed by the Lewis Research Center in 1977. All the publications were announced in the 1977 issues of STAR (Scientific and Technical Aerospace Reports) and/or IAA (International Aerospace Abstracts). Documents cited include research reports, journal articles, conference presentations, patents and patent applications, and theses