9 research outputs found

Design and Analysis of Robust Low Voltage Static Random Access Memories.

Author: Kim Daeyeon

Static Random Access Memory (SRAM) is an indispensable part of most modern VLSI designs and dominates silicon area in many applications. In scaled technologies, maintaining high SRAM yield becomes more challenging since SRAMs are particularly vulnerable to process variations due to 1) the minimum-sized devices used in SRAM bitcells and 2) the large array sizes. At the same time, low power design is a key focus throughout the semiconductor industry. Since low voltage operation is one of the most effective ways to reduce power consumption, owing to the quadratic dependence of energy on supply voltage, lowering the minimum operating voltage (Vmin) of SRAM has gained significant interest. This thesis presents four different approaches to design and analyze robust low voltage SRAM: SRAM analysis method improvement, SRAM bitcell development, SRAM peripheral optimization, and advanced device selection. We first describe a novel yield estimation method for bit-interleaved voltage-scaled 8-T SRAMs. Instead of the traditional trade-off between write and read, the trade-off between write and half select disturb is analyzed. In addition, this analysis proposes a method to find an appropriate Write Word-Line (WWL) pulse width to maximize yield. Second, a low leakage 10-T SRAM with a speed compensation scheme is proposed. During sleep mode of a sensor application, SRAM retaining data cannot be shut down, so it is important to minimize leakage in SRAM. This work adopts several leakage reduction techniques while compensating for performance. Third, an adaptive write architecture for low voltage 8-T SRAMs is proposed. By adaptively modulating WWL width and voltage level, it is possible to achieve low power consumption while maintaining high yield without excessive performance degradation. Finally, low power circuit design based on heterojunction tunneling transistors (HETTs) is discussed. HETTs have a steep subthreshold swing beneficial for low voltage operation.
Device modeling and design of logic and SRAM are proposed.
Ph.D., Electrical Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies
http://deepblue.lib.umich.edu/bitstream/2027.42/91569/1/daeyeonk_1.pd
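
The quadratic link between supply voltage and energy that the abstract above relies on can be illustrated with a small sketch. It assumes the common first-order model in which switching energy is proportional to C * Vdd^2; the function names and the example voltages are hypothetical and are not taken from the thesis.

```python
def dynamic_energy(c_farads: float, vdd: float) -> float:
    """First-order switching energy per transition: E = C * Vdd^2."""
    return c_farads * vdd ** 2


def energy_saving(v_nominal: float, v_scaled: float) -> float:
    """Fractional dynamic-energy saving when Vdd drops from v_nominal to v_scaled.

    The capacitance cancels out, so only the voltage ratio matters.
    """
    return 1.0 - (v_scaled / v_nominal) ** 2


# Hypothetical example: halving the supply voltage.
saving = energy_saving(1.0, 0.5)
print(f"dynamic energy saved: {saving:.0%}")
```

Under this model, halving the supply removes three quarters of the switching energy, which is why lowering Vmin is attractive. The model ignores leakage and the longer access time at low voltage.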

Deep Blue Documents at the University of Michigan

Low-power and high-performance SRAM design in high variability advanced CMOS technology

Author: Ataei Fatemeh
Publication date: 01/05/2017

As process technologies shrink, the size and number of memories on a chip are exponentially increasing. Embedded SRAMs are a critical component in modern digital systems, and they strongly impact the overall power, performance, and area. To promote memory-related research in academia, this dissertation introduces OpenRAM, a flexible, portable and open-source memory compiler and characterization methodology for generating and verifying memory designs across different technologies. In addition, SRAM designs focusing on improving power consumption, access time and bitcell stability are explored in high variability advanced CMOS technologies. To have a stable read/write operation for SRAM in high variability process nodes, a differential-ended single-port 8T bitcell is proposed that improves the read noise margin, write noise margin and readout bitcell current by 45%, 48% and 21%, respectively, compared to a conventional 6T bitcell. Also, a differential-ended single-port 12T bitcell for subthreshold operation is proposed that solves the half-select disturbance and allows efficient bit-interleaving. The 12T bitcell has a leakage control mechanism which helps to reduce the power consumption and provides operation down to 0.3 V. Both 8T and 12T bitcells are analyzed in a 64 kb SRAM array using 32 nm technology. In addition, to further improve the access time and power consumption, two tracking circuits (multi replica bitline delay and reconfigurable replica bitline delay techniques) are proposed to aid the generation of an accurate and optimum sense amplifier set time. An error tolerant SRAM architecture suitable for low voltage video applications with dynamic power-quality management is also proposed in this dissertation. This memory uses three power supplies to improve the SRAM stability at low voltages. The proposed triple-supply approach achieves 63% improvement in image quality and 69% reduction in power consumption compared to a single-supply 64 kb SRAM array at 0.70 V.

SHAREOK repository

A Read-Decoupled Gated-Ground SRAM Architecture for Low-Power Embedded Memories

Author: Hussain Wasim
Publication date: 17/11/2011

In order to meet the incessantly growing demand of performance, the amount of embedded or on-chip memory in microprocessors and systems-on-chip (SOC) is increasing. As much as 70% of the chip area is now dedicated to the embedded memory, which is primarily realized by the static random access memory (SRAM). Because of the large size of the SRAM, its yield and leakage power consumption dominate the overall yield and leakage power consumption of the chip. However, as the CMOS technology continues to scale in the sub-65 nanometer regime to reduce the transistor cost and the dynamic power, it poses a number of challenges on the SRAM design. In this thesis, we address these challenges and propose cell-level and architecture level solutions to increase the yield and reduce the leakage power consumption of the SRAM in nanoscale CMOS technologies. The conventional six transistor (6T) SRAM cell inherently suffers from a trade-off between the read stability and write-ability because of using the same bit line pair for both the read and write operations. An optimum design at a given process and voltage condition is a key to ensuring the yield and reliability of the SRAM. However, with technology scaling, process-induced variations in the transistor dimensions and electrical parameters coupled with variation in the operating conditions make it difficult to achieve a reasonably high yield. In this work, a gated SRAM architecture based on a seven transistor (7T) SRAM bit-cell is proposed to address these concerns. The proposed cell decouples the read bit line from the write bit lines. As a result, the storage node is not affected by any read induced noise during the read operation. Consequently, the proposed cell shows higher data stability and yield under varying process, voltage, and temperature (PVT) conditions. 
A single-ended sense amplifier is also presented to read from the proposed 7T cell, while a unique write mechanism is used to reduce the write power to less than half of the write power of the conventional 6T cell. The proposed cell consumes similar silicon area and leakage power to the 6T cell when laid out and simulated using a commercial 65-nm CMOS technology. However, as much as 77% reduction in leakage power can be achieved by coupling the 7T cell with the column virtual grounding (CVG) technique, where a non-zero voltage is applied to the source terminals of the driver NMOS transistors in the cell. The CVG technique also enables implementing multiple words per row, which is a key requirement for memories to avoid multiple-bit data upset in the event of a radiation-induced single event upset or soft error. In addition, the proposed cell inherently has a 30% larger soft error critical charge, making its soft error rate (SER) less than half that of the 6T cell.

Concordia University Research Repository

TuRaN: True Random Number Generation Using Supply Voltage Underscaling in SRAMs

Author: Bostancı F. Nisa
Ergin Oğuz
Ghiasi Nika Mansouri
Mutlu Onur
Olgun Ataberk
Salami Behzad
Tuğrul Yahya Can
Yağlıkçı A. Giray
Yüksel İsmail Emir
Publication date: 20/11/2022

Prior works propose SRAM-based TRNGs that extract entropy from SRAM arrays. SRAM arrays are widely used to store data on-chip in a majority of specialized and general-purpose chips. Thus, SRAM-based TRNGs present a low-cost alternative to dedicated hardware TRNGs. However, existing SRAM-based TRNGs suffer from 1) low TRNG throughput, 2) high energy consumption, 3) high TRNG latency, and 4) the inability to generate true random numbers continuously, which limits the application space of SRAM-based TRNGs. Our goal in this paper is to design an SRAM-based TRNG that overcomes these four key limitations and thus extends the application space of SRAM-based TRNGs. To this end, we propose TuRaN, a new high-throughput, energy-efficient, and low-latency SRAM-based TRNG that can sustain continuous operation. TuRaN leverages the key observation that accessing SRAM cells results in random access failures when the supply voltage is reduced below the manufacturer-recommended supply voltage. TuRaN generates random numbers at high throughput by repeatedly accessing SRAM cells with reduced supply voltage and post-processing the resulting random faults using the SHA-256 hash function. To demonstrate the feasibility of TuRaN, we conduct SPICE simulations on different process nodes and analyze the potential of access failure for use as an entropy source. We verify and support our simulation results by conducting real-world experiments on two commercial off-the-shelf FPGA boards. We evaluate the quality of the random numbers generated by TuRaN using the widely-adopted NIST standard randomness tests and observe that TuRaN passes all tests. TuRaN generates true random numbers with (i) an average (maximum) throughput of 1.6 Gbps (1.812 Gbps), (ii) 0.11 nJ/bit energy consumption, and (iii) 278.46 µs latency.
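
The post-processing step described in the abstract above (hashing faulty reads with SHA-256) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the undervolted SRAM is replaced by a seeded software simulation, and the names `read_with_faults` and `condition`, the 5% flip probability, and the 64-byte pattern are all hypothetical.

```python
import hashlib
import random


def read_with_faults(expected: bytes, flip_prob: float, rng: random.Random) -> bytes:
    """Stand-in for an undervolted SRAM read: each bit flips with flip_prob.

    Assumption: real hardware would supply these faults; here they are simulated.
    """
    out = bytearray(expected)
    for i in range(len(out)):
        for bit in range(8):
            if rng.random() < flip_prob:
                out[i] ^= 1 << bit
    return bytes(out)


def condition(raw_reads: list[bytes]) -> bytes:
    """Post-process raw faulty reads into a 256-bit block using SHA-256."""
    h = hashlib.sha256()
    for r in raw_reads:
        h.update(r)
    return h.digest()


# Seeded PRNG: a simulation placeholder only, NOT an entropy source.
rng = random.Random(0)
pattern = bytes([0xAA] * 64)  # hypothetical known pattern written before each read
reads = [read_with_faults(pattern, 0.05, rng) for _ in range(8)]
random_block = condition(reads)
print(random_block.hex())
```

The hash only conditions whatever entropy the raw reads carry; with a simulated, seeded fault source the output is fully deterministic, so the sketch shows the data flow rather than true randomness.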

arXiv.org e-Print Archive

Repository for Publications and Research Data

U-DVS SRAM design considerations

Author: Sinangil Mahmut E. (Mahmut Ersin)
Publication venue: Massachusetts Institute of Technology
Publication date: 01/01/2008

Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2008. Includes bibliographical references (leaves 73-78).
With the continuous scaling down of transistor feature sizes, the semiconductor industry faces new challenges. One of these challenges is the incessant increase of power consumption in integrated circuits. This problem has motivated the industry and academia to pay significant attention to low-power circuit design for the past two decades. Operating digital circuits at lower voltage levels was shown to increase energy efficiency and lower power consumption. Being an integral part of digital systems, Static Random Access Memories (SRAMs) dominate the power consumption and area of modern integrated circuits. Consequently, designing low-power, high-density SRAMs operational at low voltage levels is an important research problem. This thesis focuses on and makes several contributions to low-power SRAM design. The trade-offs and potential overheads associated with designing SRAMs for a very large voltage range are analyzed. An 8T SRAM cell is designed and optimized for both sub-threshold and above-threshold operation. Hardware reconfigurability is proposed as a solution to the power and area overheads due to peripheral assist circuitry, which is necessary for low voltage operation. A 64kbit SRAM has been designed in a 65nm CMOS process and the fabricated chip has been tested, demonstrating operation at power supply levels from 0.25V to 1.2V. This is the largest operating voltage range reported in the 65nm semiconductor technology node. Additionally, another low voltage SRAM has been designed for the on-chip caches of a low-power H.264 video decoder. Power and performance models of the memories have been developed along with a configurable interface circuit. This custom memory implemented with the low-power architecture of the decoder provides nearly 10X power savings.
by Mahmut E. Sinangil. S.M.

DSpace@MIT

Ultra-Low Power Circuit Design for Cubic-Millimeter Wireless Sensor Platform.

Author: Lee Yoonmyung

Modern daily life is surrounded by smaller and smaller computing devices. As Bell’s Law predicts, the research community is now looking at tiny computing platforms, and mm³-scale sensor systems are drawing an increasing amount of attention since they can create a whole new computing environment. Designing mm³-scale sensor nodes raises various circuit and system level challenges, and we have addressed and proposed novel solutions for many of these challenges to create the first complete 1.0mm³ sensor system including a commercial microprocessor. We demonstrate a 1.0mm³ form factor sensor whose modular die-stacked structure allows maximum volume utilization. Low power I2C communication enables inter-layer serial communication without losing compatibility with the standard I2C communication protocol. A dual microprocessor enables concurrent computation for the sensor node control and measurement data processing. A multi-modal power management unit allows energy harvesting from various harvesting sources. An optical communication scheme is provided for initial programming, synchronization and re-programming after recovery from battery discharge. Standby power reduction techniques are investigated, and a super cut-off power gating scheme with an ultra-low power charge pump reduces the standby power of logic circuits by 2-19× and memory by 30%. Different approaches for designing low-power memory for mm³-scale sensor nodes are also presented in this work. A dual threshold voltage gain cell eDRAM design achieves the lowest eDRAM retention power, and a 7T SRAM design based on hetero-junction tunneling transistors reduces the standby power of SRAM by 9-19× with only 15% area overhead. We have paid special attention to the timer for the mm³-scale sensor systems and propose a multi-stage gate-leakage-based timer to limit the standard deviation of the error in hourly measurement to 196ms, and a temperature compensation scheme reduces temperature dependency to 31ppm/°C.
These techniques for designing ultra-low power circuits for a mm3-scale sensor enable implementation of a 1.0mm3 sensor node, which can be used as a skeleton for future micro-sensor systems in a variety of applications. These microsystems imply the continuation of Bell’s Law, which also predicts the massive deployment of mm3-scale computing systems and the emergence of even smaller and more powerful computing systems in the near future.
Ph.D., Electrical Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies
http://deepblue.lib.umich.edu/bitstream/2027.42/91438/1/sori_1.pd

Deep Blue Documents at the University of Michigan

Exploiting Natural On-chip Redundancy for Energy Efficient Memory and Computing

Author: Alastruey Benedé Jesús
Ferrerón Labari Alexandra
Suárez Gracia Darío
Publication venue: Universidad de Zaragoza, Prensas de la Universidad
Publication date: 01/01/2016

Power density is currently the primary design constraint across most computing segments and the main performance limiting factor. For years, industry has kept power density constant while increasing frequency and lowering transistor supply (Vdd) and threshold (Vth) voltages. However, Vth scaling has stopped because leakage current is exponentially related to it. Transistor count and integration density keep doubling every process generation (Moore’s Law), but the power budget caps the amount of hardware that can be active at the same time, leading to dark silicon. With each new generation, there are more resources available, but we cannot fully exploit their performance potential. In recent years, different research trends have explored how to cope with dark silicon and unlock the energy efficiency of the chips, including Near-Threshold voltage Computing (NTC) and approximate computing. NTC aggressively lowers Vdd to values near Vth. This allows a substantial reduction in power, as dynamic power scales quadratically with supply voltage. The resultant power reduction could be used to activate more chip resources and potentially achieve performance improvements. Unfortunately, Vdd scaling is limited by the tight functionality margins of on-chip SRAM transistors. When scaling Vdd down to near-threshold values, manufacturing-induced parameter variations affect the functionality of SRAM cells, which eventually become unreliable. A large number of emerging applications, on the other hand, feature an intrinsic error-resilience property, tolerating a certain amount of noise. In this context, approximate computing takes advantage of this observation and exploits the gap between the level of accuracy required by the application and the level of accuracy given by the computation, provided that reducing the accuracy translates into an energy gain.
However, deciding which instructions and data and which techniques are best suited for approximation still poses a major challenge. This dissertation contributes in these two directions. First, it proposes a new approach to mitigate the impact of SRAM failures due to parameter variation for effective operation at ultra-low voltages. We identify two levels of natural on-chip redundancy: cache level and content level. The first arises because of the replication of blocks in multi-level cache hierarchies. We exploit this redundancy with a cache management policy that allocates blocks to entries taking into account the nature of the cache entry and the use pattern of the block. This policy obtains performance improvements between 2% and 34% with respect to block disabling, a technique with similar complexity, while incurring no additional storage overhead. The latter (content level redundancy) arises because of the redundancy of data in real world applications. We exploit this redundancy by compressing cache blocks to fit them in partially functional cache entries. At the cost of a slight overhead increase, we can obtain performance within 2% of that obtained when the cache is built with fault-free cells, even if more than 90% of the cache entries have at least one faulty cell. Then, we analyze how the intrinsic noise tolerance of emerging applications can be exploited to design an approximate Instruction Set Architecture (ISA). Exploiting the ISA redundancy, we explore a set of techniques to approximate the execution of instructions across a set of emerging applications, pointing out the potential of reducing the complexity of the ISA and the trade-offs of the approach. In a proof-of-concept implementation, the ISA is shrunk in two dimensions: Breadth (i.e., simplifying instructions) and Depth (i.e., dropping instructions). This proof-of-concept shows that energy can be reduced by 20.6% on average at around 14.9% accuracy loss.
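
The content-level idea in the abstract above, compressing a block so that it fits into a partially functional cache entry, can be sketched as a simple capacity check. This is a minimal illustration under loud assumptions: faults are tracked per byte, `zlib` stands in for whatever hardware compression scheme the dissertation uses, and the names `functional_bytes` and `fits` are hypothetical.

```python
import zlib


def functional_bytes(fault_map: list[bool]) -> int:
    """Count usable bytes in an entry; fault_map[i] is True if byte i is faulty."""
    return sum(1 for faulty in fault_map if not faulty)


def fits(block: bytes, fault_map: list[bool]) -> bool:
    """Return True if the block, compressed or raw, fits in the working bytes.

    Assumption: zlib is only a software stand-in for a hardware compressor.
    """
    compressed = zlib.compress(block)
    needed = min(len(block), len(compressed))
    return needed <= functional_bytes(fault_map)


# Hypothetical 64-byte entry with half of its bytes faulty.
half_faulty = [True] * 32 + [False] * 32
print(fits(bytes(64), half_faulty))         # highly redundant block
print(fits(bytes(range(64)), half_faulty))  # block with no repeated bytes
```

In contrast, block disabling would discard the half-faulty entry outright, whereas this check lets a sufficiently redundant block still use it.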

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Repositorio Universidad de Zaragoza

WORKLOAD-ADAPTATION IN MEMORY CONTROLLERS

Author: Ghasempour Mohsen
Publication date: 01/08/2016

The University of Manchester - Institutional Repository