176 research outputs found

    Benchmarking of standard-cell based memories in the sub-VT domain in 65-nm CMOS technology

    Get PDF
    In this paper, standard-cell based memories (SCMs) are proposed as an alternative to full-custom sub-VT SRAM macros for ultra-low-power systems requiring small memory blocks. The energy per memory access as well as the maximum achievable throughput in the sub-VT domain of various SCM architectures are evaluated by means of a gate-level sub-VT characterization model, building on data extracted from fully placed, routed, and back-annotated netlists. The reliable operation at the energy-minimum voltage of the various SCM architectures in a 65-nm CMOS technology considering within-die process parameter variations is demonstrated by means of Monte Carlo circuit simulation. Finally, the energy per memory access, the achievable throughput, and the area of the best SCM architecture are compared to recent sub-VT SRAM designs

    Static random-access memory designs based on different FinFET at lower technology node (7nm)

    Get PDF
    Title from PDF of title page viewed January 15, 2020Thesis advisor: Masud H ChowdhuryVitaIncludes bibliographical references (page 50-57)Thesis (M.S.)--School of Computing and Engineering. University of Missouri--Kansas City, 2019The Static Random-Access Memory (SRAM) has a significant performance impact on current nanoelectronics systems. To improve SRAM efficiency, it is important to utilize emerging technologies to overcome short-channel effects (SCE) of conventional CMOS. FinFET devices are promising emerging devices that can be utilized to improve the performance of SRAM designs at lower technology nodes. In this thesis, I present detail analysis of SRAM cells using different types of FinFET devices at 7nm technology. From the analysis, it can be concluded that the performance of both 6T and 8T SRAM designs are improved. 6T SRAM achieves a 44.97% improvement in the read energy compared to 8T SRAM. However, 6T SRAM write energy degraded by 3.16% compared to 8T SRAM. Read stability and write ability of SRAM cells are determined using Static Noise Margin and N- curve methods. Moreover, Monte Carlo simulations are performed on the SRAM cells to evaluate process variations. Simulations were done in HSPICE using 7nm Asymmetrical Underlap FinFET technology. The quasiplanar FinFET structure gained considerable attention because of the ease of the fabrication process [1] – [4]. Scaling of technology have degraded the performance of CMOS designs because of the short channel effects (SCEs) [5], [6]. Therefore, there has been upsurge in demand for FinFET devices for emerging market segments including artificial intelligence and cloud computing (AI) [8], [9], Internet of Things (IoT) [10] – [13] and biomedical [17] –[18] which have their own exclusive style of design. In recent years, many Underlapped FinFET devices were proposed to have better control of the SCEs in the sub-nanometer technologies [3], [4], [19] – [33]. Underlap on either side of the gate increases effective channel length as seen by the charge carriers. Consequently, the source-to-drain tunneling probability is improved. Moreover, edge direct tunneling leakage components can be reduced by controlling the electric field at the gate-drain junction . There is a limitation on the extent of underlap on drain or source sides because the ION is lower for larger underlap. Additionally, FinFET based designs have major width quantization issue. The width of a FinFET device increases only in quanta of silicon fin height (HFIN) [4]. The width quantization issue becomes critical for ratioed designs like SRAMs, where proper sizing of the transistors is essential for fault-free operation. FinFETs based on Design/Technology Co-Optimization (DTCO_F) approach can overcome these issues [38]. DTCO_F follows special design rules, which provides the specifications for the standard SRAM cells with special spacing rules and low leakages. The performances of 6T SRAM designs implemented by different FinFET devices are compared for different pull-up, pull down and pass gate transistor (PU: PD:PG) ratios to identify the best FinFET device for high speed and low power SRAM applications. Underlapped FinFETs (UF) and Design/Technology Co-Optimized FinFETs (DTCO_F) are used for the design and analysis. It is observed that with the PU: PD:PG ratios of 1:1:1 and 1:5:2 for the UF-SRAMs the read energy has degraded by 3.31% and 48.72% compared to the DTCO_F-SRAMs, respectively. However, the read energy with 2:5:2 ratio has improved by 32.71% in the UF-SRAM compared to the DTCO_F-SRAMs. The write energy with 1:1:1 configuration has improved by 642.27% in the UF-SRAM compared to the DTCO_F-SRAM. On the other hand, the write energy with 1:5:2 and 2:5:2 configurations have degraded by 86.26% and 96% in the UF-SRAMs compared to the DTCO_F-SRAMs. The stability and reliability of different SRAMs are also evaluated for 500mV supply. From the analysis, it can be concluded that Asymmetrical Underlapped FinFET is better for high-speed applications and DTCO FinFET for low power applications.Introduction -- Next generation high performance device: FinFET -- FinFET based SRAM bitcell designs -- Benchmarking of UF-SRAMs and DTCO-F-SRAMS -- Collaborative project -- Internship experience at INTEL and Marvell Semiconductor -- Conclusion and future wor

    Vertical III-V Nanowire Transistors for Low-Power Logic and Reconfigurable Applications

    Get PDF
    With rapid increase in energy consumption of electronics used in our daily life, the building blocks — transistors — need to work in a way that has high energy efficiency and functional density to meet the demand of further scaling. III-V channel combined with vertical nanowire gate-all-around (GAA) device architecture is a promising alternative to conventional Si transistors due to its excellent electrical properties in the channel and electrostatic control across the gate oxide in addition to reduced footprint. Based on this platform, two major objectives of this thesis are included: 1) to improve the performance of III-V p-type metal-oxide-semiconductor field-effect transistors (MOSFETs) and tunnel FETs (TFETs) for low-power digital applications; 2) to integrate HfO2-based ferroelectric gate onto III-V FETs (FeFETs) and TFETs (ferro-TFETs) to enable reconfigurable operation for high functional density.The key bottleneck for all-III-V CMOS is its p-type MOSFETs (p-FETs) which are mainly made of GaSb or InGaSb. Rich surface states of III-Sb materials not only lead to decreased effective channel mobility due to more scattering, but also deteriorate the electrostatics. In this thesis, several approaches to improve p-FET performance have been explored. One strategy is to enhance the hole mobility by introducing compressive strain into III-Sb channel. For the first time, a high and uniform compressive strain near 1% along the transport direction has been achieved in downscaled GaSb nanowires by growing and engineering GaSb-GaAsSb core-shell structure, aiming for potential hole mobility enhancement. In addition, surface passivation using digital etch has been developed to improve the electrostatics with subthreshold swing (SS) down to 107 mV/dec. Moreover, the on-state performance including on-current (Ion) and transconductance (gm) have been enhanced by ∼50% using annealing with H2-based forming gas. Lastly, a novel p-FET structure with (In)GaAsSb channel has been developed and further improved off-state performance with SS = 71 mV/dec, which is the lowest value among all reported III-V p-FETs.Despite subthermionic operation, TFETs usually suffer from low drive current as well as the current operating below 60 mV/dec (I60). The second focus of this thesis is to fine-tune the InAs/(In)GaAsSb heterostructure tunnel junction and the doping in the source segment during epitaxy. As a result, a substantially increased I60 (>1 µA/µm) and Ion up to 40 µA/µm at source-drain bias of 0.5 V have been achieved, reaching a record compared to other reported TFETs.Finally, emerging ferroelectric oxide based on Zr-doped HfO2 (HZO) has been successfully integrated onto III-V vertical nanowire transistors to form FeFETs and ferro-TFETs with GAA architecture. The corresponding electrical performance and reliability have been carefully characterized with both DC and pulsed I-V measurements. The unique band-to-band tunneling in InAs/(In)GaAsSb/GaSb heterostructure TFET creates an ultrashort effective channel, leading to detection of localized potential variation induced by single domains and defects in nanoscale ferroelectric HZO without physical gate-length scaling. By introducing gate/source overlap structure in the ferro-TFET, non-volatile reconfigurable signal modulation with multiple modes including signal transmission, phase shift, frequency doubling, and mixing has been achieved in a single device with low drive voltage and only ∼0.01 µm2 footprint, thus increasing both functional density andenergy efficiency

    Two-Port Low-Power Gain-Cell Storage Array: Voltage Scaling and Retention Time

    Get PDF
    The impact of supply voltage scaling on the retention time of a 2-transistor (2T) gain-cell (GC) storage array is investigated, in order to enable low-power/low-voltage data storage. The retention time can be increased when scaling down the supply voltage for a given access statistics and a given write bit-line (WBL) control scheme. Moreover, for a given supply voltage, the retention time can be further increased by controlling the WBL to a voltage level between the supply rails during idle and read states. These two concepts are proved by means of Spectre simulation of a GC-storage array implemented in 180-nm CMOS technology. The proposed 2-kb storage macro is operated at only 40% of the nominal supply voltage and leverages the GCs to enable two-port operation with a negligible area-increase compared to a single-port implementation

    Mr.Wolf: An Energy-Precision Scalable Parallel Ultra Low Power SoC for IoT Edge Processing

    Get PDF
    This paper presents Mr. Wolf, a parallel ultra-low power (PULP) system on chip (SoC) featuring a hierarchical architecture with a small (12 kgates) microcontroller (MCU) class RISC-V core augmented with an autonomous IO subsystem for efficient data transfer from a wide set of peripherals. The small core can offload compute-intensive kernels to an eight-core floating-point capable of processing engine available on demand. The proposed SoC, implemented in a 40-nm LP CMOS technology, features a 108-mu W fully retentive memory (512 kB). The IO subsystem is capable of transferring up to 1.6 Gbit/s from external devices to the memory in less than 2.5 mW. The eight-core compute cluster achieves a peak performance of 850 million of 32-bit integer multiply and accumulate per second (MMAC/s) and 500 million of 32-bit floating-point multiply and accumulate per second (MFMAC/s) -1 GFlop/s-with an energy efficiency up to 15 MMAC/s/mW and 9 MFMAC/s/mW. These building blocks are supported by aggressive on-chip power conversion and management, enabling energy-proportional heterogeneous computing for always-on IoT end nodes improving performance by several orders of magnitude with respect to traditional single-core MCUs within a power envelope of 153 mW. We demonstrated the capabilities of the proposed SoC on a wide set of near-sensor processing kernels showing that Mr. Wolf can deliver performance up to 16.4 GOp/s with energy efficiency up to 274 MOp/s/mW on real-life applications, paving the way for always-on data analytics on high-bandwidth sensors at the edge of the Internet of Things

    Low Power 6-Transistor Latch Design for Portable Devices

    Get PDF
    The latest advances in mobile battery-powered devices such as the Personal Digital Assistant (PDA) and mobile phones have set new goals in digital VLSI design. The portable devices require high speed and low power consumption. Even low power consumption is the dominant requirement and to do so speed can be compromised. In this paper a novel area efficient latch design is proposed. The simulation results show that the proposed design with less transistor count is better choice for low power and high speed portable applications. Keywords: Latch, Low power, Portable, 8T, 6T, Power consumption, Delay

    Tunnel Field Effect Transistors:from Steep-Slope Electronic Switches to Energy Efficient Logic Applications

    Get PDF
    The aim of this work has been the investigation of homo-junction Tunnel Field Effect Transistors starting from a compact modelling perspective to its possible applications. Firstly a TCAD based simulation study is done to explain the main device characteristics. The main differences of a Tunnel FET with respect to a conventional MOSFET is pointed out and the differences have been explained. A compact DC/AC model has been developed which is capable of describing the I-V characteristics in all regimes of operation. The model takes in to account ambi-polarity, drain side breakdown and all tunneling related physics. A temperature dependence is also added to the model to study the temperature independent behavior of tunneling. The model was further implemented in a Verilog-A based circuit simulator. Following calibration to experimental results of Silicon and strained-Silicon TFETs, the model has been also used to benchmark against a standard CMOS node for digital and analog applications. The circuits built with Tunnel FETs showed interesting temperature behavior which was superior to the compared CMOS node. In the same work, we also explore and propose solutions for using TFETs for low power memory applications. Both volatile and non-volatile memory concepts are investigated and explored. The application of a Tunnel FET as a capacitor-less memory has been experimentally demonstrated for the first time. New device concepts have been proposed and process flows for the same are developed to realize them in the clean room in EPFL

    Neuromorphic-P2M: Processing-in-Pixel-in-Memory Paradigm for Neuromorphic Image Sensors

    Full text link
    Edge devices equipped with computer vision must deal with vast amounts of sensory data with limited computing resources. Hence, researchers have been exploring different energy-efficient solutions such as near-sensor processing, in-sensor processing, and in-pixel processing, bringing the computation closer to the sensor. In particular, in-pixel processing embeds the computation capabilities inside the pixel array and achieves high energy efficiency by generating low-level features instead of the raw data stream from CMOS image sensors. Many different in-pixel processing techniques and approaches have been demonstrated on conventional frame-based CMOS imagers, however, the processing-in-pixel approach for neuromorphic vision sensors has not been explored so far. In this work, we for the first time, propose an asynchronous non-von-Neumann analog processing-in-pixel paradigm to perform convolution operations by integrating in-situ multi-bit multi-channel convolution inside the pixel array performing analog multiply and accumulate (MAC) operations that consume significantly less energy than their digital MAC alternative. To make this approach viable, we incorporate the circuit's non-ideality, leakage, and process variations into a novel hardware-algorithm co-design framework that leverages extensive HSpice simulations of our proposed circuit using the GF22nm FD-SOI technology node. We verified our framework on state-of-the-art neuromorphic vision sensor datasets and show that our solution consumes ~2x lower backend-processor energy while maintaining almost similar front-end (sensor) energy on the IBM DVS128-Gesture dataset than the state-of-the-art while maintaining a high test accuracy of 88.36%.Comment: 17 pages, 11 figures, 2 table
    • …
    corecore