1,486 research outputs found

    X-SRAM: Enabling In-Memory Boolean Computations in CMOS Static Random Access Memories

    Get PDF
    Silicon-based Static Random Access Memories (SRAM) and digital Boolean logic have been the workhorse of the state-of-art computing platforms. Despite tremendous strides in scaling the ubiquitous metal-oxide-semiconductor transistor, the underlying \textit{von-Neumann} computing architecture has remained unchanged. The limited throughput and energy-efficiency of the state-of-art computing systems, to a large extent, results from the well-known \textit{von-Neumann bottleneck}. The energy and throughput inefficiency of the von-Neumann machines have been accentuated in recent times due to the present emphasis on data-intensive applications like artificial intelligence, machine learning \textit{etc}. A possible approach towards mitigating the overhead associated with the von-Neumann bottleneck is to enable \textit{in-memory} Boolean computations. In this manuscript, we present an augmented version of the conventional SRAM bit-cells, called \textit{the X-SRAM}, with the ability to perform in-memory, vector Boolean computations, in addition to the usual memory storage operations. We propose at least six different schemes for enabling in-memory vector computations including NAND, NOR, IMP (implication), XOR logic gates with respect to different bit-cell topologies −- the 8T cell and the 8+^+T Differential cell. In addition, we also present a novel \textit{`read-compute-store'} scheme, wherein the computed Boolean function can be directly stored in the memory without the need of latching the data and carrying out a subsequent write operation. The feasibility of the proposed schemes has been verified using predictive transistor models and Monte-Carlo variation analysis.Comment: This article has been accepted in a future issue of IEEE Transactions on Circuits and Systems-I: Regular Paper

    Ultra-Low-Power Embedded SRAM Design for Battery- Operated and Energy-Harvested IoT Applications

    Get PDF
    Internet of Things (IoT) devices such as wearable health monitors, augmented reality goggles, home automation, smart appliances, etc. are a trending topic of research. Various IoT products are thriving in the current electronics market. The IoT application needs such as portability, form factor, weight, etc. dictate the features of such devices. Small, portable, and lightweight IoT devices limit the usage of the primary energy source to a smaller rechargeable or non-rechargeable battery. As battery life and replacement time are critical issues in battery-operated or partially energy-harvested IoT devices, ultra-low-power (ULP) system on chips (SoC) are becoming a widespread solution of chip makers’ choice. Such ULP SoC requires both logic and the embedded static random access memory (SRAM) in the processor to operate at very low supply voltages. With technology scaling for bulk and FinFET devices, logic has demonstrated to operate at low minimum operating voltages (VMIN). However, due to process and temperature variation, SRAMs have higher VMIN in scaled processes that become a huge problem in designing ULP SoC cores. This chapter discusses the latest published circuits and architecture techniques to minimize the SRAM VMIN for scaled bulk and FinFET technologies and improve battery life for ULP IoT applications

    Robust low-power digital circuit design in nano-CMOS technologies

    Get PDF
    Device scaling has resulted in large scale integrated, high performance, low-power, and low cost systems. However the move towards sub-100 nm technology nodes has increased variability in device characteristics due to large process variations. Variability has severe implications on digital circuit design by causing timing uncertainties in combinational circuits, degrading yield and reliability of memory elements, and increasing power density due to slow scaling of supply voltage. Conventional design methods add large pessimistic safety margins to mitigate increased variability, however, they incur large power and performance loss as the combination of worst cases occurs very rarely. In-situ monitoring of timing failures provides an opportunity to dynamically tune safety margins in proportion to on-chip variability that can significantly minimize power and performance losses. We demonstrated by simulations two delay sensor designs to detect timing failures in advance that can be coupled with different compensation techniques such as voltage scaling, body biasing, or frequency scaling to avoid actual timing failures. Our simulation results using 45 nm and 32 nm technology BSIM4 models indicate significant reduction in total power consumption under temperature and statistical variations. Future work involves using dual sensing to avoid useless voltage scaling that incurs a speed loss. SRAM cache is the first victim of increased process variations that requires handcrafted design to meet area, power, and performance requirements. We have proposed novel 6 transistors (6T), 7 transistors (7T), and 8 transistors (8T)-SRAM cells that enable variability tolerant and low-power SRAM cache designs. Increased sense-amplifier offset voltage due to device mismatch arising from high variability increases delay and power consumption of SRAM design. We have proposed two novel design techniques to reduce offset voltage dependent delays providing a high speed low-power SRAM design. Increasing leakage currents in nano-CMOS technologies pose a major challenge to a low-power reliable design. We have investigated novel segmented supply voltage architecture to reduce leakage power of the SRAM caches since they occupy bulk of the total chip area and power. Future work involves developing leakage reduction methods for the combination logic designs including SRAM peripherals

    An IoT Endpoint System-on-Chip for Secure and Energy-Efficient Near-Sensor Analytics

    Full text link
    Near-sensor data analytics is a promising direction for IoT endpoints, as it minimizes energy spent on communication and reduces network load - but it also poses security concerns, as valuable data is stored or sent over the network at various stages of the analytics pipeline. Using encryption to protect sensitive data at the boundary of the on-chip analytics engine is a way to address data security issues. To cope with the combined workload of analytics and encryption in a tight power envelope, we propose Fulmine, a System-on-Chip based on a tightly-coupled multi-core cluster augmented with specialized blocks for compute-intensive data processing and encryption functions, supporting software programmability for regular computing tasks. The Fulmine SoC, fabricated in 65nm technology, consumes less than 20mW on average at 0.8V achieving an efficiency of up to 70pJ/B in encryption, 50pJ/px in convolution, or up to 25MIPS/mW in software. As a strong argument for real-life flexible application of our platform, we show experimental results for three secure analytics use cases: secure autonomous aerial surveillance with a state-of-the-art deep CNN consuming 3.16pJ per equivalent RISC op; local CNN-based face detection with secured remote recognition in 5.74pJ/op; and seizure detection with encrypted data collection from EEG within 12.7pJ/op.Comment: 15 pages, 12 figures, accepted for publication to the IEEE Transactions on Circuits and Systems - I: Regular Paper

    Design issues in cross-coupled inverter sense amplifier

    Get PDF
    This paper presents an analytical approach to the design of CMOS cross-coupled inverter sense amplifiers. The effects of the equilibrating transistors and the tail current source on the speed of the sense amplifier are analyzed. An analysis of the offset due to mismatch in various parameters is performed, showing that a complete offset analysis has to account for the cell and bitline structure. A new figure of merit for the offset in the sense amplifier and several new design insights are introduced

    6T-SRAM 1Mb Design with Test Structures and Post Silicon Validation

    Get PDF
    abstract: Static random-access memories (SRAM) are integral part of design systems as caches and data memories that and occupy one-third of design space. The work presents an embedded low power SRAM on a triple well process that allows body-biasing control. In addition to the normal mode operation, the design is embedded with Physical Unclonable Function (PUF) [Suh07] and Sense Amplifier Test (SA Test) mode. With PUF mode structures, the fabrication and environmental mismatches in bit cells are used to generate unique identification bits. These bits are fixed and known as preferred state of an SRAM bit cell. The direct access test structure is a measurement unit for offset voltage analysis of sense amplifiers. These designs are manufactured using a foundry bulk CMOS 55 nm low-power (LP) process. The details about SRAM bit-cell and peripheral circuit design is discussed in detail, for certain cases the circuit simulation analysis is performed with random variations embedded in SPICE models. Further, post-silicon testing results are discussed for normal operation of SRAMs and the special test modes. The silicon and circuit simulation results for various tests are presented.Dissertation/ThesisMasters Thesis Electrical Engineering 201
    • 

    corecore