76 research outputs found

    Low-Power Soft-Error-Robust Embedded SRAM

    Get PDF
    Soft errors are radiation-induced ionization events (induced by energetic particles like alpha particles, cosmic neutron, etc.) that cause transient errors in integrated circuits. The circuit can always recover from such errors as the underlying semiconductor material is not damaged and hence, they are called soft errors. In nanometer technologies, the reduced node capacitance and supply voltage coupled with high packing density and lack of masking mechanisms are primarily responsible for the increased susceptibility of SRAMs towards soft errors. Coupled with these are the process variations (effective length, width, and threshold voltage), which are prominent in scaled-down technologies. Typically, SRAM constitutes up to 90% of the die in microprocessors and SoCs (System-on-Chip). Hence, the soft errors in SRAMs pose a potential threat to the reliable operation of the system. In this work, a soft-error-robust eight-transistor SRAM cell (8T) is proposed to establish a balance between low power consumption and soft error robustness. Using metrics like access time, leakage power, and sensitivity to single event transients (SET), the proposed approach is evaluated. For the purpose of analysis and comparisons the results of 8T cell are compared with a standard 6T SRAM cell and the state-of-the-art soft-error-robust SRAM cells. Based on simulation results in a 65-nm commercial CMOS process, the 8T cell demonstrates higher immunity to SETs along with smaller area and comparable leakage power. A 32-kb array of 8T cells was fabricated in silicon. After functional verification of the test chip, a radiation test was conducted to evaluate the soft error robustness. As SRAM cells are scaled aggressively to increase the overall packing density, the smaller transistors exhibit higher degrees of process variation and mismatch, leading to larger offset voltages. For SRAM sense amplifiers, higher offset voltages lead to an increased likelihood of an incorrect decision. To address this issue, a sense amplifier capable of cancelling the input offset voltage is presented. The simulated and measured results in 180-nm technology show that the sense amplifier is capable of detecting a 4 mV differential input signal under dc and transient conditions. The proposed sense amplifier, when compared with a conventional sense amplifier, has a similar die area and a greatly reduced offset voltage. Additionally, a dual-input sense amplifier architecture is proposed with corroborating silicon results to show that it requires smaller differential input to evaluate correctly.1 yea

    SRAM Read-Assist Scheme for Low Power High Performance Applications

    Get PDF
    Semiconductor technology scaling resulted in a considerable reduction in the transistor cost and an astonishing enhancement in the performance of VLSI (very large scale integration) systems. These nanoscale technologies have facilitated integration of large SRAMs which are now very popular for both processors and system-on-chip (SOC) designs. The density of SRAM array had a quadratic increase with each generation of CMOS technology. However, these nanoscale technologies unveiled few significant challenges to the design of high performance and low power embedded memories. First, process variation has become more significant in these technologies which threaten reliability of sensing circuitry. In order to alleviate this problem, we need to have larger signal swings on the bitlines (BLs) which degrade speed as well as power dissipation. The second challenge is due to the variation in the cell current which will reduce the worst case cell current. Since this cell current is responsible for discharging BLs, this problem will translate to longer activation time for the wordlines (WLs). The longer the WL pulse width is, the more likely is the cell to be unstable. A long WL pulse width can also degrade noise margin. Furthermore, as a result of continuous increase in the size of SRAMs, the BL capacitance has increased significantly which will deteriorate speed as well as power dissipation. The aforementioned problems require additional techniques and treatment such as read-assist techniques to insure fast, low power and reliable read operation in nanoscaled SRAMs. In this research we address these concerns and propose a read-assist sense amplifier (SA) in 65nm CMOS technology that expedites the process of developing differential voltage to be sensed by sense amplifier while reducing voltage swing on the BLs which will result in increased sensing speed, lower power and shorter WL activation time. A complete comparison is made between the proposed scheme, conventional SA and a state of the art design which shows speed improvement and power reduction of 56.1% and 25.9%, respectively over the conventional scheme at the expense of negligible area overhead. Also, the proposed scheme enables us to reduce cell VDD for having the same sensing speed which results in considerable reduction in leakage power dissipation

    Study of Single-Event Transient Effects on Analog Circuits

    Get PDF
    Radiation in space is potentially hazardous to microelectronic circuits and systems such as spacecraft electronics. Transient effects on circuits and systems from high energetic particles can interrupt electronics operation or crash the systems. This phenomenon is particularly serious in complementary metal-oxide-semiconductor (CMOS) integrated circuits (ICs) since most of modern ICs are implemented with CMOS technologies. The problem is getting worse with the technology scaling down. Radiation-hardening-by-design (RHBD) is a popular method to build CMOS devices and systems meeting performance criteria in radiation environment. Single-event transient (SET) effects in digital circuits have been studied extensively in the radiation effect community. In recent years analog RHBD has been received increasing attention since analog circuits start showing the vulnerability to the SETs due to the dramatic process scaling. Analog RHBD is still in the research stage. This study is to further study the effects of SET on analog CMOS circuits and introduces cost-effective RHBD approaches to mitigate these effects. The analog circuits concerned in this study include operational amplifiers (op amps), comparators, voltage-controlled oscillators (VCOs), and phase-locked loops (PLLs). Op amp is used to study SET effects on signal amplitude while the comparator, the VCO, and the PLL are used to study SET effects on signal state during transition time. In this work, approaches based on multi-level from transistor, circuit, to system are presented to mitigate the SET effects on the aforementioned circuits. Specifically, RHBD approach based on the circuit level, such as the op amp, adapts the auto-zeroing cancellation technique. The RHBD comparator implemented with dual-well and triple-well is studied and compared at the transistor level. SET effects are mitigated in a LC-tank oscillator by inserting a decoupling resistor. The RHBD PLL is implemented on the system level using triple modular redundancy (TMR) approach. It demonstrates that RHBD at multi-level can be cost-effective to mitigate the SEEs in analog circuits. In addition, SETs detection approaches are provided in this dissertation so that various mitigation approaches can be implemented more effectively. Performances and effectiveness of the proposed RHBD are validated through SPICE simulations on the schematic and pulsed-laser experiments on the fabricated circuits. The proposed and tested RHBD techniques can be applied to other relevant analog circuits in the industry to achieve radiation-tolerance

    Investigating Input Offset Reduction with Timing Manipulation in Low Voltage Sense Amplifiers

    Get PDF
    Static Random Access Memories (SRAMs) are ubiquitous in modern computer systems. They provide a fast and relatively compact method of data storage. SRAM cells are read from and written to using analog differential bitline signals, BL and BLB. To increase operating speed and conserve power during a read cycle, cell access time is limited to a short duration. Since SRAM are often implemented with near-minimum sized devices to maximize memory density, the devices are relatively weak and can only generate a limited differential voltage during this read window, typically between 10mV to 100mV. Standard logic devices cannot read this small signal, so sense amplifiers are used to rapidly amplify it to logic levels. A key metric for a sense amplifier’s performance is its input-referred offset voltage, . This dictates the minimum required input voltage to produce a correct decision. A lower means that a shorter read window for the SRAM is required, and the overall read cycle can be performed at a higher frequency. Unfortunately, with the trends of technology scaling, the effects of device mismatches from process variation are becoming more significant. In sense amplifiers, this device mismatch will create a statistical spread of with a mean and standard deviation of and . To guarantee error-free operation, a lower bound for input differential voltage is set by the worst-case scenario from this spread. Another difficulty introduced with modern trends is low voltage operation. The drive strengths of devices in lower VDD systems are weaker, so any imbalances due to threshold mismatch can become more significant compared to the nominal quantities. This thesis explores methods of reducing input offset voltage of low voltage SRAM sense amplifiers with a primary goal of reducing . A circuit called the Delayed PMOS VLSA, or DVLSA, is proposed. The DVLSA is based on the common VLSA and uses a timing manipulation technique with its control signals. The circuit design attempts to reduce by reducing the mismatch contribution of the PMOS pull-up pair. The circuit is tested at 0.4V with the VLSA used as a reference. Statistical simulations show that for the PMOS pull-up pair varying in isolation, the circuit works as intended and is reduced. When all differential devices are varied, the DVLSA has a larger . Investigating the source of the failure using the isolated variation of the other two device pairs shows that the timing manipulation technique has a negative impact on the NMOS pair. It also suggests that the use of the DLVSA architecture introduces additional covariances when all differential devices are varied

    Circuit Techniques for Adaptive and Reliable High Performance Computing.

    Full text link
    Increasing power density with process scaling has caused stagnation in the clock speed of modern microprocessors. Accordingly, designers have adopted message passing and shared memory based multicore architectures in order to keep up with the rapidly rising demand for computing throughput. At the same time, applications are not entirely parallel and improving single-thread performance continues to remain critical. Additionally, reliability is also worsening with process scaling, and margining for failures due to process and environmental variations in modern technologies consumes an increasingly large portion of the power/performance envelope. In the wake of multicore computing, reliability of signal synchronization between the cores is also becoming increasingly critical. This forces designers to search for alternate efficient methods to improve compute performance while addressing reliability. Accordingly, this dissertation presents innovative circuit and architectural techniques for variation-tolerance, performance and reliability targeted at datapath logic, signal synchronization and memories. Firstly, a domino logic based design style for datapath logic is presented that uses Adaptive Robustness Tuning (ART) in addition to timing speculation to provide up to 71% performance gains over conventional domino logic in 32bx32b multiplier in 65nm CMOS. Margins are reduced until functionality errors are detected, that are used to guide the tuning. Secondly, for signal synchronization across clock domains, a new class of dynamic logic based synchronizers with single-cycle synchronization latency is presented, where pulses, rather than stable intermediate voltages cause metastability. Such pulses are amplified using skewed inverters to improve mean time between failures by ~1e6x over jamb latches and double flip-flops at 2GHz in 65nm CMOS. Thirdly, a reconfigurable sensing scheme for 6T SRAMs is presented that employs auto-zero calibration and pre-amplification to improve sensing reliability (by up to 1.2 standard deviations of NMOS threshold voltage in 28nm CMOS); this increased reliability is in turn traded for ~42% sensing speedup. Finally, a main memory architecture design methodology to address reliability and power in the context of Exascale computing systems is presented. Based on 3D-stacked DRAMs, the methodology co-optimizes DRAM access energy, refresh power and the increased cost of error resilience, to meet stringent power and reliability constraints.PhDElectrical EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/107238/1/bharan_1.pd

    Low-power and high-performance SRAM design in high variability advanced CMOS technology

    Get PDF
    As process technologies shrink, the size and number of memories on a chip are exponentially increasing. Embedded SRAMs are a critical component in modern digital systems, and they strongly impact the overall power, performance, and area. To promote memory-related research in academia, this dissertation introduces OpenRAM, a flexible, portable and open-source memory compiler and characterization methodology for generating and verifying memory designs across different technologies.In addition, SRAM designs, focusing on improving power consumption, access time and bitcell stability are explored in high variability advanced CMOS technologies. To have a stable read/write operation for SRAM in high variability process nodes, a differential-ended single-port 8T bitcell is proposed that improves the read noise margin, write noise margin and readout bitcell current by 45%, 48% and 21%, respectively, compared to a conventional 6T bitcell. Also, a differential-ended single-port 12T bitcell for subthreshold operation is proposed that solves the half-select disturbance and allows efficient bit-interleaving. 12T bitcell has a leakage control mechanism which helps to reduce the power consumption and provides operation down to 0.3 V. Both 8T and 12T bitcells are analyzed in a 64 kb SRAM array using 32 nm technology. Besides, to further improve the access time and power consumption, two tracking circuits (multi replica bitline delay and reconfigurable replica bitline delay techniques) are proposed to aid the generation of accurate and optimum sense amplifier set time.An error tolerant SRAM architecture suitable for low voltage video application with dynamic power-quality management is also proposed in this dissertation. This memory uses three power supplies to improve the SRAM stability in low voltages. The proposed triple-supply approach achieves 63% improvement in image quality and 69% reduction in power consumption compared to a single-supply 64 kb SRAM array at 0.70 V

    Project and development of hardware accelerators for fast computing in multimedia processing

    Get PDF
    2017 - 2018The main aim of the present research work is to project and develop very large scale electronic integrated circuits, with particular attention to the ones devoted to image processing applications and the related topics. In particular, the candidate has mainly investigated four topics, detailed in the following. First, the candidate has developed a novel multiplier circuit capable of obtaining floating point (FP32) results, given as inputs an integer value from a fixed integer range and a set of fixed point (FI) values. The result has been accomplished exploiting a series of theorems and results on a number theory problem, known as Bachet’s problem, which allows the development of a new Distributed Arithmetic (DA) based on 3’s partitions. This kind of application results very fit for filtering applications working on an integer fixed input range, such in image processing applications, in which the pixels are coded on 8 bits per channel. In fact, in these applications the main problem is related to the high area and power consumption due to the presence of many Multiply and Accumulate (MAC) units, also compromising real-time requirements due to the complexity of FP32 operations. For these reasons, FI implementations are usually preferred, at the cost of lower accuracies. The results for the single multiplier and for a filter of dimensions 3x3 show respectively delay of 2.456 ns and 4.7 ns on FPGA platform and 2.18 ns and 4.426 ns on 90nm std_cell TSMC 90 nm implementation. Comparisons with state-of-the-art FP32 multipliers show a speed increase of up to 94.7% and an area reduction of 69.3% on FPGA platform. ... [edited by Author]XXXI cicl

    Review and Classification of Gain Cell eDRAM Implementations

    Get PDF
    With the increasing requirement of a high-density, high-performance, low power alternative to traditional SRAM, Gain Cell (GC) embedded DRAMs have gained a renewed interest in recent years. Several industrial and academic publications have presented GC memory implementations for various target applications, including high-performance processor caches, wireless communication memories, and biomedical system storage. In this paper, we review and compare the recent publications, examining the design requirements and the implementation techniques that lead to achievement of the required design metrics of these applications
    • …
    corecore