17 research outputs found

    Low-power and high-performance SRAM design in high variability advanced CMOS technology

    Get PDF
    As process technologies shrink, the size and number of memories on a chip are exponentially increasing. Embedded SRAMs are a critical component in modern digital systems, and they strongly impact the overall power, performance, and area. To promote memory-related research in academia, this dissertation introduces OpenRAM, a flexible, portable and open-source memory compiler and characterization methodology for generating and verifying memory designs across different technologies.In addition, SRAM designs, focusing on improving power consumption, access time and bitcell stability are explored in high variability advanced CMOS technologies. To have a stable read/write operation for SRAM in high variability process nodes, a differential-ended single-port 8T bitcell is proposed that improves the read noise margin, write noise margin and readout bitcell current by 45%, 48% and 21%, respectively, compared to a conventional 6T bitcell. Also, a differential-ended single-port 12T bitcell for subthreshold operation is proposed that solves the half-select disturbance and allows efficient bit-interleaving. 12T bitcell has a leakage control mechanism which helps to reduce the power consumption and provides operation down to 0.3 V. Both 8T and 12T bitcells are analyzed in a 64 kb SRAM array using 32 nm technology. Besides, to further improve the access time and power consumption, two tracking circuits (multi replica bitline delay and reconfigurable replica bitline delay techniques) are proposed to aid the generation of accurate and optimum sense amplifier set time.An error tolerant SRAM architecture suitable for low voltage video application with dynamic power-quality management is also proposed in this dissertation. This memory uses three power supplies to improve the SRAM stability in low voltages. The proposed triple-supply approach achieves 63% improvement in image quality and 69% reduction in power consumption compared to a single-supply 64 kb SRAM array at 0.70 V

    Challenges and Directions for Low-Voltage SRAM

    Full text link

    Energy-Aware Data Movement In Non-Volatile Memory Hierarchies

    Get PDF
    While technology scaling enables increased density for memory cells, the intrinsic high leakage power of conventional CMOS technology and the demand for reduced energy consumption inspires the use of emerging technology alternatives such as eDRAM and Non-Volatile Memory (NVM) including STT-MRAM, PCM, and RRAM. The utilization of emerging technology in Last Level Cache (LLC) designs which occupies a signifcant fraction of total die area in Chip Multi Processors (CMPs) introduces new dimensions of vulnerability, energy consumption, and performance delivery. To be specific, a part of this research focuses on eDRAM Bit Upset Vulnerability Factor (BUVF) to assess vulnerable portion of the eDRAM refresh cycle where the critical charge varies depending on the write voltage, storage and bit-line capacitance. This dissertation broaden the study on vulnerability assessment of LLC through investigating the impact of Process Variations (PV) on narrow resistive sensing margins in high-density NVM arrays, including on-chip cache and primary memory. Large-latency and power-hungry Sense Amplifers (SAs) have been adapted to combat PV in the past. Herein, a novel approach is proposed to leverage the PV in NVM arrays using Self-Organized Sub-bank (SOS) design. SOS engages the preferred SA alternative based on the intrinsic as-built behavior of the resistive sensing timing margin to reduce the latency and power consumption while maintaining acceptable access time. On the other hand, this dissertation investigates a novel technique to prioritize the service to 1) Extensive Read Reused Accessed blocks of the LLC that are silently dropped from higher levels of cache, and 2) the portion of the working set that may exhibit distant re-reference interval in L2. In particular, we develop a lightweight Multi-level Access History Profiler to effciently identify ERRA blocks through aggregating the LLC block addresses tagged with identical Most Signifcant Bits into a single entry. Experimental results indicate that the proposed technique can reduce the L2 read miss ratio by 51.7% on average across PARSEC and SPEC2006 workloads. In addition, this dissertation will broaden and apply advancements in theories of subspace recovery to pioneer computationally-aware in-situ operand reconstruction via the novel Logic In Interconnect (LI2) scheme. LI2 will be developed, validated, and re?ned both theoretically and experimentally to realize a radically different approach to post-Moore\u27s Law computing by leveraging low-rank matrices features offering data reconstruction instead of fetching data from main memory to reduce energy/latency cost per data movement. We propose LI2 enhancement to attain high performance delivery in the post-Moore\u27s Law era through equipping the contemporary micro-architecture design with a customized memory controller which orchestrates the memory request for fetching low-rank matrices to customized Fine Grain Reconfigurable Accelerator (FGRA) for reconstruction while the other memory requests are serviced as before. The goal of LI2 is to conquer the high latency/energy required to traverse main memory arrays in the case of LLC miss, by using in-situ construction of the requested data dealing with low-rank matrices. Thus, LI2 exchanges a high volume of data transfers with a novel lightweight reconstruction method under specific conditions using a cross-layer hardware/algorithm approach

    Designing energy-efficient sub-threshold logic circuits using equalization and non-volatile memory circuits using memristors

    Full text link
    The very large scale integration (VLSI) community has utilized aggressive complementary metal-oxide semiconductor (CMOS) technology scaling to meet the ever-increasing performance requirements of computing systems. However, as we enter the nanoscale regime, the prevalent process variation effects degrade the CMOS device reliability. Hence, it is increasingly essential to explore emerging technologies which are compatible with the conventional CMOS process for designing highly-dense memory/logic circuits. Memristor technology is being explored as a potential candidate in designing non-volatile memory arrays and logic circuits with high density, low latency and small energy consumption. In this thesis, we present the detailed functionality of multi-bit 1-Transistor 1-memRistor (1T1R) cell-based memory arrays. We present the performance and energy models for an individual 1T1R memory cell and the memory array as a whole. We have considered TiO2- and HfOx-based memristors, and for these technologies there is a sub-10% difference between energy and performance computed using our models and HSPICE simulations. Using a performance-driven design approach, the energy-optimized TiO2-based RRAM array consumes the least write energy (4.06 pJ/bit) and read energy (188 fJ/bit) when storing 3 bits/cell for 100 nsec write and 1 nsec read access times. Similarly, HfOx-based RRAM array consumes the least write energy (365 fJ/bit) and read energy (173 fJ/bit) when storing 3 bits/cell for 1 nsec write and 200 nsec read access times. On the logic side, we investigate the use of equalization techniques to improve the energy efficiency of digital sequential logic circuits in sub-threshold regime. We first propose the use of a variable threshold feedback equalizer circuit with combinational logic blocks to mitigate the timing errors in digital logic designed in sub-threshold regime. This mitigation of timing errors can be leveraged to reduce the dominant leakage energy by scaling supply voltage or decreasing the propagation delay. At the fixed supply voltage, we can decrease the propagation delay of the critical path in a combinational logic block using equalizer circuits and, correspondingly decrease the leakage energy consumption. For a 8-bit carry lookahead adder designed in UMC 130 nm process, the operating frequency can be increased by 22.87% (on average), while reducing the leakage energy by 22.6% (on average) in the sub-threshold regime. Overall, the feedback equalization technique provides up to 35.4% lower energy-delay product compared to the conventional non-equalized logic. We also propose a tunable adaptive feedback equalizer circuit that can be used with sequential digital logic to mitigate the process variation effects and reduce the dominant leakage energy component in sub-threshold digital logic circuits. For a 64-bit adder designed in 130 nm our proposed approach can reduce the normalized delay variation of the critical path delay from 16.1% to 11.4% while reducing the energy-delay product by 25.83% at minimum energy supply voltage. In addition, we present detailed energy-performance models of the adaptive feedback equalizer circuit. This work serves as a foundation for the design of robust, energy-efficient digital logic circuits in sub-threshold regime

    A fully integrated SRAM-based CMOS arbitrary waveform generator for analog signal processing

    Get PDF
    This dissertation focuses on design and implementation of a fully-integrated SRAM-based arbitrary waveform generator for analog signal processing applications in a CMOS technology. The dissertation consists of two parts: Firstly, a fully-integrated arbitrary waveform generator for a multi-resolution spectrum sensing of a cognitive radio applications, and an analog matched-filter for a radar application and secondly, low-power techniques for an arbitrary waveform generator. The fully-integrated low-power AWG is implemented and measured in a 0.18-„Ïm CMOS technology. Theoretical analysis is performed, and the perspective implementation issues are mentioned comparing the measurement results. Moreover, the low-power techniques of SRAM are addressed for the analog signal processing: Self-deactivated data-transition bit scheme, diode-connected low-swing signaling scheme with a short-current reduction buffer, and charge-recycling with a push-pull level converter for power reduction of asynchronous design. Especially, the robust latch-type sense amplifier using an adaptive-latch resistance and fully-gated ground 10T-SRAM bitcell in a 45-nm SOI technology would be used as a technique to overcome the challenges in the upcoming deep-submicron technologies.Ph.D.Committee Chair: Kim, Jongman; Committee Member: Kang, Sung Ha; Committee Member: Lee, Chang-Ho; Committee Member: Mukhopadhyay, Saibal; Committee Member: Tentzeris, Emmanouil

    Low-voltage embedded biomedical processor design

    Get PDF
    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2010.Cataloged from PDF version of thesis.Includes bibliographical references (p. 180-190).Advances in mobile electronics are fueling new possibilities in a variety of applications, one of which is ambulatory medical monitoring with body-worn or implanted sensors. Digital processors on such sensors serve to analyze signals in real-time and extract key features for transmission or storage. To support diverse and evolving applications, the processor should be flexible, and to extend sensor operating lifetime, the processor should be energy-efficient. This thesis focuses on architectures and circuits for low power biomedical signal processing. A general-purpose processor is extended with custom hardware accelerators to reduce the cycle count and energy for common tasks, including FIR and median filtering as well as computing FFTs and mathematical functions. Improvements to classic architectures are proposed to reduce power and improve versatility: an FFT accelerator demonstrates a new control scheme to reduce datapath switching activity, and a modified CORDIC engine features increased input range and decreased quantization error over conventional designs. At the system level, the addition of accelerators increases leakage power and bus loading; strategies to mitigate these costs are analyzed in this thesis. A key strategy for improving energy efficiency is to aggressively scale the power supply voltage according to application performance demands. However, increased sensitivity to variation at low voltages must be mitigated in logic and SRAM design. For logic circuits, a design flow and a hold time verification methodology addressing local variation are proposed and demonstrated in a 65nm microcontroller functioning at 0.3V. For SRAMs, a model for the weak-cell read current is presented for near-V supply voltages, and a self-timed scheme for reducing internal bus glitches is employed with low leakage overhead. The above techniques are demonstrated in a 0.5-1.OV biomedical signal processing platform in 0.13p-Lm CMOS. The use of accelerators for key signal processing enabled greater than 10x energy reduction in two complete EEG and EKG analysis applications, as compared to implementations on a conventional processor.by Joyce Y. S. Kwong.Ph.D

    U-DVS SRAM design considerations

    Get PDF
    Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2008.Includes bibliographical references (leaves 73-78).With the continuous scaling down of transistor feature sizes, the semiconductor industry faces new challenges. One of these challenges is the incessant increase of power consumption in integrated circuits. This problem has motivated the industry and academia to pay significant attention to low-power circuit design for the past two decades. Operating digital circuits at lower voltage levels was shown to increase energy efficiency and lower power consumption. Being an integral part of the digital systems, Static Random Access Memories (SRAMs), dominate the power consumption and area of modern integrated circuits. Consequently, designing low-power high density SRAMs operational at low voltage levels is an important research problem. This thesis focuses on and makes several contributions to low-power SRAM design. The trade-offs and potential overheads associated with designing SRAMs for a very large voltage range are analyzed. An 8T SRAM cell is designed and optimized for both sub-threshold and above-threshold operation. Hardware reconfigurability is proposed as a solution to power and area overheads due to peripheral assist circuitry which are necessary for low voltage operation. A 64kbit SRAM has been designed in 65nm CMOS process and the fabricated chip has been tested, demonstrating operation at power supply levels from 0.25V to 1.2V. This is the largest operating voltage range reported in 65nm semiconductor technology node. Additionally, another low voltage SRAM has been designed for the on-chip caches of a low-power H.264 video decoder. Power and performance models of the memories have been developed along with a configurable interface circuit. This custom memory implemented with the low-power architecture of the decoder provides nearly 10X power savings.by Mahmut E. Sinangil.S.M

    Benchmarking of standard-cell based memories in the sub-VT domain in 65-nm CMOS technology

    Get PDF
    In this paper, standard-cell based memories (SCMs) are proposed as an alternative to full-custom sub-VT SRAM macros for ultra-low-power systems requiring small memory blocks. The energy per memory access as well as the maximum achievable throughput in the sub-VT domain of various SCM architectures are evaluated by means of a gate-level sub-VT characterization model, building on data extracted from fully placed, routed, and back-annotated netlists. The reliable operation at the energy-minimum voltage of the various SCM architectures in a 65-nm CMOS technology considering within-die process parameter variations is demonstrated by means of Monte Carlo circuit simulation. Finally, the energy per memory access, the achievable throughput, and the area of the best SCM architecture are compared to recent sub-VT SRAM designs

    Ultra Low Power Digital Circuit Design for Wireless Sensor Network Applications

    Get PDF
    Ny forskning innenfor feltet trĂ„dlĂžse sensornettverk Ă„pner for nye og innovative produkter og lĂžsninger. Biomedisinske anvendelser er blant omrĂ„dene med stĂžrst potensial og det investeres i dag betydelige belĂžp for Ă„ bruke denne teknologien for Ă„ gjĂžre medisinsk diagnostikk mer effektiv samtidig som man Ă„pner for fjerndiagnostikk basert pĂ„ trĂ„dlĂžse sensornoder integrert i et ”helsenett”. MĂ„let er Ă„ forbedre tjenestekvalitet og redusere kostnader samtidig som brukerne skal oppleve forbedret livskvalitet som fĂžlge av Ăžkt trygghet og mulighet for Ă„ tilbringe mest mulig tid i eget hjem og unngĂ„ unĂždvendige sykehusbesĂžk og innleggelser. For Ă„ gjĂžre dette til en realitet er man avhengige av sensorelektronikk som bruker minst mulig energi slik at man oppnĂ„r tilstrekkelig batterilevetid selv med veldig smĂ„ batterier. I sin avhandling ” Ultra Low power Digital Circuit Design for Wireless Sensor Network Applications” har PhD-kandidat Farshad Moradi fokusert pĂ„ nye lĂžsninger innenfor konstruksjon av energigjerrig digital kretselektronikk. Avhandlingen presenterer nye lĂžsninger bĂ„de innenfor aritmetiske og kombinatoriske kretser, samtidig som den studerer nye statiske minneelementer (SRAM) og alternative minnearkitekturer. Den ser ogsĂ„ pĂ„ utfordringene som oppstĂ„r nĂ„r silisiumteknologien nedskaleres i takt med mikroprosessorutviklingen og foreslĂ„r lĂžsninger som bidrar til Ă„ gjĂžre kretslĂžsninger mer robuste og skalerbare i forhold til denne utviklingen. De viktigste konklusjonene av arbeidet er at man ved Ă„ introdusere nye konstruksjonsteknikker bĂ„de er i stand til Ă„ redusere energiforbruket samtidig som robusthet og teknologiskalerbarhet Ăžker. Forskningen har vĂŠrt utfĂžrt i samarbeid med Purdue University og vĂŠrt finansiert av Norges ForskningsrĂ„d gjennom FRINATprosjektet ”Micropower Sensor Interface in Nanometer CMOS Technology”
    corecore