16 research outputs found

    Embedding Logic and Non-volatile Devices in CMOS Digital Circuits for Improving Energy Efficiency

    Get PDF
    abstract: Static CMOS logic has remained the dominant design style of digital systems for more than four decades due to its robustness and near zero standby current. Static CMOS logic circuits consist of a network of combinational logic cells and clocked sequential elements, such as latches and flip-flops that are used for sequencing computations over time. The majority of the digital design techniques to reduce power, area, and leakage over the past four decades have focused almost entirely on optimizing the combinational logic. This work explores alternate architectures for the flip-flops for improving the overall circuit performance, power and area. It consists of three main sections. First, is the design of a multi-input configurable flip-flop structure with embedded logic. A conventional D-type flip-flop may be viewed as realizing an identity function, in which the output is simply the value of the input sampled at the clock edge. In contrast, the proposed multi-input flip-flop, named PNAND, can be configured to realize one of a family of Boolean functions called threshold functions. In essence, the PNAND is a circuit implementation of the well-known binary perceptron. Unlike other reconfigurable circuits, a PNAND can be configured by simply changing the assignment of signals to its inputs. Using a standard cell library of such gates, a technology mapping algorithm can be applied to transform a given netlist into one with an optimal mixture of conventional logic gates and threshold gates. This approach was used to fabricate a 32-bit Wallace Tree multiplier and a 32-bit booth multiplier in 65nm LP technology. Simulation and chip measurements show more than 30% improvement in dynamic power and more than 20% reduction in core area. The functional yield of the PNAND reduces with geometry and voltage scaling. The second part of this research investigates the use of two mechanisms to improve the robustness of the PNAND circuit architecture. One is the use of forward and reverse body biases to change the device threshold and the other is the use of RRAM devices for low voltage operation. The third part of this research focused on the design of flip-flops with non-volatile storage. Spin-transfer torque magnetic tunnel junctions (STT-MTJ) are integrated with both conventional D-flipflop and the PNAND circuits to implement non-volatile logic (NVL). These non-volatile storage enhanced flip-flops are able to save the state of system locally when a power interruption occurs. However, manufacturing variations in the STT-MTJs and in the CMOS transistors significantly reduce the yield, leading to an overly pessimistic design and consequently, higher energy consumption. A detailed analysis of the design trade-offs in the driver circuitry for performing backup and restore, and a novel method to design the energy optimal driver for a given yield is presented. Efficient designs of two nonvolatile flip-flop (NVFF) circuits are presented, in which the backup time is determined on a per-chip basis, resulting in minimizing the energy wastage and satisfying the yield constraint. To achieve a yield of 98%, the conventional approach would have to expend nearly 5X more energy than the minimum required, whereas the proposed tunable approach expends only 26% more energy than the minimum. A non-volatile threshold gate architecture NV-TLFF are designed with the same backup and restore circuitry in 65nm technology. The embedded logic in NV-TLFF compensates performance overhead of NVL. This leads to the possibility of zero-overhead non-volatile datapath circuits. An 8-bit multiply-and- accumulate (MAC) unit is designed to demonstrate the performance benefits of the proposed architecture. Based on the results of HSPICE simulations, the MAC circuit with the proposed NV-TLFF cells is shown to consume at least 20% less power and area as compared to the circuit designed with conventional DFFs, without sacrificing any performance.Dissertation/ThesisDoctoral Dissertation Electrical Engineering 201

    BOOLEAN AND BRAIN-INSPIRED COMPUTING USING SPIN-TRANSFER TORQUE DEVICES

    Get PDF
    Several completely new approaches (such as spintronic, carbon nanotube, graphene, TFETs, etc.) to information processing and data storage technologies are emerging to address the time frame beyond current Complementary Metal-Oxide-Semiconductor (CMOS) roadmap. The high speed magnetization switching of a nano-magnet due to current induced spin-transfer torque (STT) have been demonstrated in recent experiments. Such STT devices can be explored in compact, low power memory and logic design. In order to truly leverage STT devices based computing, researchers require a re-think of circuit, architecture, and computing model, since the STT devices are unlikely to be drop-in replacements for CMOS. The potential of STT devices based computing will be best realized by considering new computing models that are inherently suited to the characteristics of STT devices, and new applications that are enabled by their unique capabilities, thereby attaining performance that CMOS cannot achieve. The goal of this research is to conduct synergistic exploration in architecture, circuit and device levels for Boolean and brain-inspired computing using nanoscale STT devices. Specifically, we first show that the non-volatile STT devices can be used in designing configurable Boolean logic blocks. We propose a spin-memristor threshold logic (SMTL) gate design, where memristive cross-bar array is used to perform current mode summation of binary inputs and the low power current mode spintronic threshold device carries out the energy efficient threshold operation. Next, for brain-inspired computing, we have exploited different spin-transfer torque device structures that can implement the hard-limiting and soft-limiting artificial neuron transfer functions respectively. We apply such STT based neuron (or ‘spin-neuron’) in various neural network architectures, such as hierarchical temporal memory and feed-forward neural network, for performing “human-like” cognitive computing, which show more than two orders of lower energy consumption compared to state of the art CMOS implementation. Finally, we show the dynamics of injection locked Spin Hall Effect Spin-Torque Oscillator (SHE-STO) cluster can be exploited as a robust multi-dimensional distance metric for associative computing, image/ video analysis, etc. Our simulation results show that the proposed system architecture with injection locked SHE-STOs and the associated CMOS interface circuits can be suitable for robust and energy efficient associative computing and pattern matching

    Energy-Efficient Digital Circuit Design using Threshold Logic Gates

    Get PDF
    abstract: Improving energy efficiency has always been the prime objective of the custom and automated digital circuit design techniques. As a result, a multitude of methods to reduce power without sacrificing performance have been proposed. However, as the field of design automation has matured over the last few decades, there have been no new automated design techniques, that can provide considerable improvements in circuit power, leakage and area. Although emerging nano-devices are expected to replace the existing MOSFET devices, they are far from being as mature as semiconductor devices and their full potential and promises are many years away from being practical. The research described in this dissertation consists of four main parts. First is a new circuit architecture of a differential threshold logic flipflop called PNAND. The PNAND gate is an edge-triggered multi-input sequential cell whose next state function is a threshold function of its inputs. Second a new approach, called hybridization, that replaces flipflops and parts of their logic cones with PNAND cells is described. The resulting \hybrid circuit, which consists of conventional logic cells and PNANDs, is shown to have significantly less power consumption, smaller area, less standby power and less power variation. Third, a new architecture of a field programmable array, called field programmable threshold logic array (FPTLA), in which the standard lookup table (LUT) is replaced by a PNAND is described. The FPTLA is shown to have as much as 50% lower energy-delay product compared to conventional FPGA using well known FPGA modeling tool called VPR. Fourth, a novel clock skewing technique that makes use of the completion detection feature of the differential mode flipflops is described. This clock skewing method improves the area and power of the ASIC circuits by increasing slack on timing paths. An additional advantage of this method is the elimination of hold time violation on given short paths. Several circuit design methodologies such as retiming and asynchronous circuit design can use the proposed threshold logic gate effectively. Therefore, the use of threshold logic flipflops in conventional design methodologies opens new avenues of research towards more energy-efficient circuits.Dissertation/ThesisDoctoral Dissertation Computer Science 201

    Fault and Defect Tolerant Computer Architectures: Reliable Computing With Unreliable Devices

    Get PDF
    This research addresses design of a reliable computer from unreliable device technologies. A system architecture is developed for a fault and defect tolerant (FDT) computer. Trade-offs between different techniques are studied and yield and hardware cost models are developed. Fault and defect tolerant designs are created for the processor and the cache memory. Simulation results for the content-addressable memory (CAM)-based cache show 90% yield with device failure probabilities of 3 x 10(-6), three orders of magnitude better than non fault tolerant caches of the same size. The entire processor achieves 70% yield with device failure probabilities exceeding 10(-6). The required hardware redundancy is approximately 15 times that of a non-fault tolerant design. While larger than current FT designs, this architecture allows the use of devices much more likely to fail than silicon CMOS. As part of model development, an improved model is derived for NAND Multiplexing. The model is the first accurate model for small and medium amounts of redundancy. Previous models are extended to account for dependence between the inputs and produce more accurate results

    Dense implementations of binary cellular nonlinear networks : from CMOS to nanotechnology

    Get PDF
    This thesis deals with the design and hardware realization of the cellular neural/nonlinear network (CNN)-type processors operating on data in the form of black and white (B/W) images. The ultimate goal is to achieve a very compact yet versatile cell structure that would allow for building a network with a very large spatial resolution. It is very important to be able to implement an array with a great number of cells on a single die. Not only it improves the computational power of the processor, but it might be the enabling factor for new applications as well. Larger resolution can be achieved in two ways. First, the cell functionality and operating principles can be tailored to improve the layout compactness. The other option is to use more advanced fabrication technology – either a newer, further downscaled CMOS process or one of the emerging nanotechnologies. It can be beneficial to realize an array processor as two separate parts – one dedicated for gray-scale and the other for B/W image processing, as their designs can be optimized. For instance, an implementation of a CNN dedicated for B/W image processing can be significantly simplified. When working with binary images only, all coefficients in the template matrix can also be reduced to binary values. In this thesis, such a binary programming scheme is presented as a means to reduce the cell size as well as to provide the circuits composed of emerging nanodevices with an efficient programmability. Digital programming can be very fast and robust, and leads to very compact coefficient circuits. A test structure of a binary-programmable CNN has been designed and implemented with standard 0.18 ”m CMOS technology. A single cell occupies only 155 ”m2, which corresponds to a cell density of 6451 cells per square millimeter. A variety of templates have been tested and the measured chip performance is discussed. Since the minimum feature size of modern CMOS devices has already entered the nanometer scale, and the limitations of further scaling are projected to be reached within the next decade or so, more and more interest and research activity is attracted by nanotechnology. Investigation of the quantum physics phenomena and development of new devices and circuit concepts, which would allow to overcome the CMOS limitations, is becoming an increasingly important science. A single-electron tunneling (SET) transistor is one of the most attractive nanodevices. While relying on the Coulomb interactions, these devices can be connected directly with a wire or through a coupling capacitance. To develop suitable structures for implementing the binary programming scheme with capacitive couplings, the CNN cell based on the floating gate MOSFET (FG-MOSFET) has been designed. This approach can be considered as a step towards a programmable cell implementation with nanodevices. Capacitively coupled CNN has been simulated and the presented results confirm the proper operation. Therefore, the same circuit strategies have also been applied to the CNN cell designed for SET technology. The cell has been simulated to work well with the binary programming scheme applied. This versatile structure can be implemented either as a pure SET design or as a SET-FET hybrid. In addition to the designs mentioned above, a number of promising nanodevices and emerging circuit architectures are introduced.reviewe

    Cutting Edge Nanotechnology

    Get PDF
    The main purpose of this book is to describe important issues in various types of devices ranging from conventional transistors (opening chapters of the book) to molecular electronic devices whose fabrication and operation is discussed in the last few chapters of the book. As such, this book can serve as a guide for identifications of important areas of research in micro, nano and molecular electronics. We deeply acknowledge valuable contributions that each of the authors made in writing these excellent chapters

    Image Processing and Analysis for Preclinical and Clinical Applications

    Get PDF
    Radiomics is one of the most successful branches of research in the field of image processing and analysis, as it provides valuable quantitative information for the personalized medicine. It has the potential to discover features of the disease that cannot be appreciated with the naked eye in both preclinical and clinical studies. In general, all quantitative approaches based on biomedical images, such as positron emission tomography (PET), computed tomography (CT) and magnetic resonance imaging (MRI), have a positive clinical impact in the detection of biological processes and diseases as well as in predicting response to treatment. This Special Issue, “Image Processing and Analysis for Preclinical and Clinical Applications”, addresses some gaps in this field to improve the quality of research in the clinical and preclinical environment. It consists of fourteen peer-reviewed papers covering a range of topics and applications related to biomedical image processing and analysis
    corecore