163 research outputs found

    Exploiting Application Behaviors for Resilient Static Random Access Memory Arrays in the Near-Threshold Computing Regime

    Get PDF
    Near-Threshold Computing embodies an intriguing choice for mobile processors due to the promise of superior energy efficiency, extending the battery life of these devices while reducing the peak power draw. However, process, voltage, and temperature variations cause a significantly high failure rate of Level One cache cells in the near-threshold regime a stark contrast to designs in the super-threshold regime, where fault sites are rare. This thesis work shows that faulty cells in the near-threshold regime are highly clustered in certain regions of the cache. In addition, popular mobile benchmarks are studied to investigate the impact of run-time workloads on timing faults manifestation. A technique to mitigate the run-time faults is proposed. This scheme maps frequently used data to healthy cache regions by exploiting the application cache behaviors. The results show up to 78% gain in performance over two other state-of-the-art techniques

    Embracing Low-Power Systems with Improvement in Security and Energy-Efficiency

    Get PDF
    As the economies around the world are aligning more towards usage of computing systems, the global energy demand for computing is increasing rapidly. Additionally, the boom in AI based applications and services has already invited the pervasion of specialized computing hardware architectures for AI (accelerators). A big chunk of research in the industry and academia is being focused on providing energy efficiency to all kinds of power hungry computing architectures. This dissertation adds to these efforts. Aggressive voltage underscaling of chips is one the effective low power paradigms of providing energy efficiency. This dissertation identifies and deals with the reliability and performance problems associated with this paradigm and innovates novel energy efficient approaches. Specifically, the properties of a low power security primitive have been improved and, higher performance has been unlocked in an AI accelerator (Google TPU) in an aggressively voltage underscaled environment. And, novel power saving opportunities have been unlocked by characterizing the usage pattern of a baseline TPU with rigorous mathematical analysis

    Low Leakage and PDP Optimized FinFET based 8T SRAM Design

    Get PDF
    The paper proposes a Fin Field Effect Transistors (FinFETs) based SRAM design comprising of 8 transistors. The circuit utilizes channel length of 22 nanometers. The operation of this circuit is dependent upon the control switch CS that decides the operating mode and minimizes the leakage current flowing in the cell which in turn lowers the leakage power to a minimal value of 0.331pW. The read buffer available in the design provides a different path for read mode and also enhances the Read Static Noise Margin (RSNM), thus enhancing the readability of the circuit.This design is also able to operate at a minimal voltage of 70mV, thus efficiently utilizing the power available. It also optimizes the power delay product (PDP) for both read and write operations

    Embracing Visual Experience and Data Knowledge: Efficient Embedded Memory Design for Big Videos and Deep Learning

    Get PDF
    Energy efficient memory designs are becoming increasingly important, especially for applications related to mobile video technology and machine learning. The growing popularity of smart phones, tablets and other mobile devices has created an exponential demand for video applications in today?s society. When mobile devices display video, the embedded video memory within the device consumes a large amount of the total system power. This issue has created the need to introduce power-quality tradeoff techniques for enabling good quality video output, while simultaneously enabling power consumption reduction. Similarly, power efficiency issues have arisen within the area of machine learning, especially with applications requiring large and fast computation, such as neural networks. Using the accumulated data knowledge from various machine learning applications, there is now the potential to create more intelligent memory with the capability for optimized trade-off between energy efficiency, area overhead, and classification accuracy on the learning systems. In this dissertation, a review of recently completed works involving video and machine learning memories will be covered. Based on the collected results from a variety of different methods, including: subjective trials, discovered data-mining patterns, software simulations, and hardware power and performance tests, the presented memories provide novel ways to significantly enhance power efficiency for future memory devices. An overview of related works, especially the relevant state-of-the-art research, will be referenced for comparison in order to produce memory design methodologies that exhibit optimal quality, low implementation overhead, and maximum power efficiency.National Science FoundationND EPSCoRCenter for Computationally Assisted Science and Technology (CCAST

    Robust low-power digital circuit design in nano-CMOS technologies

    Get PDF
    Device scaling has resulted in large scale integrated, high performance, low-power, and low cost systems. However the move towards sub-100 nm technology nodes has increased variability in device characteristics due to large process variations. Variability has severe implications on digital circuit design by causing timing uncertainties in combinational circuits, degrading yield and reliability of memory elements, and increasing power density due to slow scaling of supply voltage. Conventional design methods add large pessimistic safety margins to mitigate increased variability, however, they incur large power and performance loss as the combination of worst cases occurs very rarely. In-situ monitoring of timing failures provides an opportunity to dynamically tune safety margins in proportion to on-chip variability that can significantly minimize power and performance losses. We demonstrated by simulations two delay sensor designs to detect timing failures in advance that can be coupled with different compensation techniques such as voltage scaling, body biasing, or frequency scaling to avoid actual timing failures. Our simulation results using 45 nm and 32 nm technology BSIM4 models indicate significant reduction in total power consumption under temperature and statistical variations. Future work involves using dual sensing to avoid useless voltage scaling that incurs a speed loss. SRAM cache is the first victim of increased process variations that requires handcrafted design to meet area, power, and performance requirements. We have proposed novel 6 transistors (6T), 7 transistors (7T), and 8 transistors (8T)-SRAM cells that enable variability tolerant and low-power SRAM cache designs. Increased sense-amplifier offset voltage due to device mismatch arising from high variability increases delay and power consumption of SRAM design. We have proposed two novel design techniques to reduce offset voltage dependent delays providing a high speed low-power SRAM design. Increasing leakage currents in nano-CMOS technologies pose a major challenge to a low-power reliable design. We have investigated novel segmented supply voltage architecture to reduce leakage power of the SRAM caches since they occupy bulk of the total chip area and power. Future work involves developing leakage reduction methods for the combination logic designs including SRAM peripherals

    Robust Circuit Design for Low-Voltage VLSI.

    Full text link
    Voltage scaling is an effective way to reduce the overall power consumption, but the major challenges in low voltage operations include performance degradation and reliability issues due to PVT variations. This dissertation discusses three key circuit components that are critical in low-voltage VLSI. Level converters must be a reliable interface between two voltage domains, but the reduced on/off-current ratio makes it extremely difficult to achieve robust conversions at low voltages. Two static designs are proposed: LC2 adopts a novel pulsed-operation and modulates its pull-up strength depending on its state. A 3-sigma robustness is guaranteed using a current margin plot; SLC inherently reduces the contention by diode-insertion. Improvements in performance, power, and robustness are measured from 130nm CMOS test chips. SRAM is a major bottleneck in voltage-scaling due to its inherent ratioed-bitcell design. The proposed 7T SRAM alleviates the area overhead incurred by 8T bitcells and provides robust operation down to 0.32V in 180nm CMOS test chips with 3.35fW/bit leakage. Auto-Shut-Off provides a 6.8x READ energy reduction, and its innate Quasi-Static READ has been demonstrated which shows a much improved READ error rate. A use of PMOS Pass-Gate improves the half-select robustness by directly modulating the device strength through bitline voltage. Clocked sequential elements, flip-flops in short, are ubiquitous in today’s digital systems. The proposed S2CFF is static, single-phase, contention-free, and has the same number of devices as in TGFF. It shows a 40% power reduction as well as robust low-voltage operations in fabricated 45nm SOI test chips. Its simple hold-time path and the 3.4x improvement in 3-sigma hold-time is presented. A new on-chip flip-flop testing harness is also proposed, and measured hold-time variations of flip-flops are presented.PhDElectrical EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/111525/1/yejoong_1.pd

    Expanded Noise Margin 10T SRAM Cell using Finfet Device

    Get PDF
    Static random access memory (SRAM) cells are being improved in order to increase resistance to device level changes and satisfy the requirements of low-power applications. A unique 10-transistor FinFET-based SRAM cell with single-ended read and differential write functionality is presented in this study. This cutting-edge architecture is more power-efficient than ST (Schmitt trigger) 10T or traditional 6T SRAM cells, using only 1.87 and 1.6 units of power respectively during read operations. The efficiency is attributable to a lower read activity factor, which saves electricity. The read static noise margin (RSNM) and write static noise margin (WSNM) of the proposed 10T SRAM cell show notable improvements over the 6T SRAM cell, increasing by 1.67 and 1.86, respectively. Additionally, compared to the 6T SRAM cell, the read access time has been significantly reduced by 1.96 seconds. Utilising the Cadence Virtuoso tool and an 18nm Advanced Node Process Design Kit (PDK) technology file, the design's efficacy has been confirmed. For low-power electronic systems and next-generation memory applications, this exciting 10T SRAM cell has a lot of potential

    Static random-access memory designs based on different FinFET at lower technology node (7nm)

    Get PDF
    Title from PDF of title page viewed January 15, 2020Thesis advisor: Masud H ChowdhuryVitaIncludes bibliographical references (page 50-57)Thesis (M.S.)--School of Computing and Engineering. University of Missouri--Kansas City, 2019The Static Random-Access Memory (SRAM) has a significant performance impact on current nanoelectronics systems. To improve SRAM efficiency, it is important to utilize emerging technologies to overcome short-channel effects (SCE) of conventional CMOS. FinFET devices are promising emerging devices that can be utilized to improve the performance of SRAM designs at lower technology nodes. In this thesis, I present detail analysis of SRAM cells using different types of FinFET devices at 7nm technology. From the analysis, it can be concluded that the performance of both 6T and 8T SRAM designs are improved. 6T SRAM achieves a 44.97% improvement in the read energy compared to 8T SRAM. However, 6T SRAM write energy degraded by 3.16% compared to 8T SRAM. Read stability and write ability of SRAM cells are determined using Static Noise Margin and N- curve methods. Moreover, Monte Carlo simulations are performed on the SRAM cells to evaluate process variations. Simulations were done in HSPICE using 7nm Asymmetrical Underlap FinFET technology. The quasiplanar FinFET structure gained considerable attention because of the ease of the fabrication process [1] – [4]. Scaling of technology have degraded the performance of CMOS designs because of the short channel effects (SCEs) [5], [6]. Therefore, there has been upsurge in demand for FinFET devices for emerging market segments including artificial intelligence and cloud computing (AI) [8], [9], Internet of Things (IoT) [10] – [13] and biomedical [17] –[18] which have their own exclusive style of design. In recent years, many Underlapped FinFET devices were proposed to have better control of the SCEs in the sub-nanometer technologies [3], [4], [19] – [33]. Underlap on either side of the gate increases effective channel length as seen by the charge carriers. Consequently, the source-to-drain tunneling probability is improved. Moreover, edge direct tunneling leakage components can be reduced by controlling the electric field at the gate-drain junction . There is a limitation on the extent of underlap on drain or source sides because the ION is lower for larger underlap. Additionally, FinFET based designs have major width quantization issue. The width of a FinFET device increases only in quanta of silicon fin height (HFIN) [4]. The width quantization issue becomes critical for ratioed designs like SRAMs, where proper sizing of the transistors is essential for fault-free operation. FinFETs based on Design/Technology Co-Optimization (DTCO_F) approach can overcome these issues [38]. DTCO_F follows special design rules, which provides the specifications for the standard SRAM cells with special spacing rules and low leakages. The performances of 6T SRAM designs implemented by different FinFET devices are compared for different pull-up, pull down and pass gate transistor (PU: PD:PG) ratios to identify the best FinFET device for high speed and low power SRAM applications. Underlapped FinFETs (UF) and Design/Technology Co-Optimized FinFETs (DTCO_F) are used for the design and analysis. It is observed that with the PU: PD:PG ratios of 1:1:1 and 1:5:2 for the UF-SRAMs the read energy has degraded by 3.31% and 48.72% compared to the DTCO_F-SRAMs, respectively. However, the read energy with 2:5:2 ratio has improved by 32.71% in the UF-SRAM compared to the DTCO_F-SRAMs. The write energy with 1:1:1 configuration has improved by 642.27% in the UF-SRAM compared to the DTCO_F-SRAM. On the other hand, the write energy with 1:5:2 and 2:5:2 configurations have degraded by 86.26% and 96% in the UF-SRAMs compared to the DTCO_F-SRAMs. The stability and reliability of different SRAMs are also evaluated for 500mV supply. From the analysis, it can be concluded that Asymmetrical Underlapped FinFET is better for high-speed applications and DTCO FinFET for low power applications.Introduction -- Next generation high performance device: FinFET -- FinFET based SRAM bitcell designs -- Benchmarking of UF-SRAMs and DTCO-F-SRAMS -- Collaborative project -- Internship experience at INTEL and Marvell Semiconductor -- Conclusion and future wor
    • …
    corecore