151 research outputs found

    An Energy-Efficient Design Paradigm for a Memory Cell Based on Novel Nanoelectromechanical Switches

    Get PDF
    In this chapter, we explain NEMsCAM cell, a new content-addressable memory (CAM) cell, which is designed based on both CMOS technologies and nanoelectromechanical (NEM) switches. The memory part of NEMsCAM is designed with two complementary nonvolatile NEM switches and located on top of the CMOS-based comparison component. As a use case, we evaluate first-level instruction and data translation lookaside buffers (TLBs) with 16 nm CMOS technology at 2 GHz. The simulation results demonstrate that the NEMsCAM TLB reduces the energy consumption per search operation (by 27%), standby mode (by 53.9%), write operation (by 41.9%), and the area (by 40.5%) compared to a CMOS-only TLB with minimal performance overhead

    Array-based architecture for FET-based, nanoscale electronics

    Get PDF
    Advances in our basic scientific understanding at the molecular and atomic level place us on the verge of engineering designer structures with key features at the single nanometer scale. This offers us the opportunity to design computing systems at what may be the ultimate limits on device size. At this scale, we are faced with new challenges and a new cost structure which motivates different computing architectures than we found efficient and appropriate in conventional very large scale integration (VLSI). We sketch a basic architecture for nanoscale electronics based on carbon nanotubes, silicon nanowires, and nano-scale FETs. This architecture can provide universal logic functionality with all logic and signal restoration operating at the nanoscale. The key properties of this architecture are its minimalism, defect tolerance, and compatibility with emerging bottom-up nanoscale fabrication techniques. The architecture further supports micro-to-nanoscale interfacing for communication with conventional integrated circuits and bootstrap loading

    System-level Prototyping with HyperTransport

    Get PDF
    The complexity of computer systems continues to increase. Emulation of proposed subsystems is one way to manage this growing complexity when evaluating the performance of proposed architectures. HyperTransport allows researchers to connect directly to microprocessors with FPGAs. This enables the emulation of novel memory hierarchies, non-volatile memory designs, coprocessors, and other architectural changes, combined with an existing system

    NEMsCAM: A novel CAM cell based on nano-electro-mechanical switch and CMOS for energy efficient TLBs

    Get PDF
    In this paper we propose a novel Content Addressable Memory (CAM) cell, NEMsCAM, based on both Nano-electro-mechanical (NEM) switches and CMOS technologies. The memory component of the proposed CAM cell is designed with two complementary non-volatile NEM switches and located on top of the CMOS-based comparison component. As a use case for the NEMsCAM cell, we design first-level data and instruction Translation Lookaside Buffers (TLBs) with 16nm CMOS technology at 2GHz. The simulations show that the NEMsCAM TLB reduces the energy consumption per search operation (by 27%), write operation (by 41.9%) and standby mode (by 53.9%), and the area (by 40.5%) compared to a CMOS-only TLB with minimal performance overhead.We thank all anonymous reviewers for their insightful comments. This work is supported in part by the European Union (FEDER funds) under contract TIN2012-34557, and the European Union’s Seventh Framework Programme (FP7/2007-2013) under the ParaDIME project (GA no. 318693)Postprint (author's final draft

    Emerging Run-Time Reconfigurable FPGA and CAD Tools

    Get PDF
    Field-programmable gate array (FPGA) is a post fabrication reconfigurable device to accelerate domain specific computing systems. It offers offer high operation speed and low power consumption. However, the design flexibility and performance of FPGAs are severely constrained by the costly on-chip memories, e.g. static random access memory (SRAM) and FLASH memory. The objective of my dissertation is to explore the opportunity and enable the use of the emerging resistance random access memory (ReRAM) in FPGA design. The emerging ReRAM technology features high storage density, low access power consumption, and CMOS compatibility, making it a promising candidate for FPGA implementation. In particular, ReRAM has advantages of the fast access and nonvolatility, enabling the on-chip storage and access of configuration data. In this dissertation, I first propose a novel three-dimensional stacking scheme, namely, high-density interleaved memory (HIM). The structure improves the density of ReRAM meanwhile effectively reducing the signal interference induced by sneak paths in crossbar arrays. To further enhance the access speed and design reliability, a fast sensing circuit is also presented which includes a new sense amplifier scheme and reference cell configuration. The proposed ReRAM FPGA leverages a similar architecture as conventional SRAM based FPGAs but utilizes ReRAM technology in all component designs. First, HIM is used to implement look-up table (LUT) and block random access memories (BRAMs) for func- tionality process. Second, a 2R1T, two ReRAM cells and one transistor, nonvolatile switch design is applied to construct connection blocks (CBs) and switch blocks (SBs) for signal transition. Furthermore, unified BRAM (uBRAM) based on the current BRAM architecture iv is introduced, offering both configuration and temporary data storage. The uBRAMs provides extremely high density effectively and enlarges the FPGA capacity, potentially saving multiple contexts of configuration. The fast configuration scheme from uBRAM to logic and routing components also makes fast run-time partial reconfiguration (PR) much easier, improving the flexibility and performance of the entire FPGA system. Finally, modern place and route tools are designed for homogeneous fabric of FPGA. The PR feature, however, requires the support of heterogeneous logic modules in order to differentiate PR modules from static ones and therefore maintain the signal integration. The existing approaches still reply on designers’ manual effort, which significantly prolongs design time and lowers design efficiency. In this dissertation, I integrate PR support into VPR – an academic place and route tool by introducing a B*-tree modular placer (BMP) and PR-aware router. As such, users are able to explore new architectures or map PR applications to a variety of FPGAs. More importantly, this enhanced feature can also support fast design automation, e.g. mapping IP core, loading pre-synthesizing logic modules, etc

    Power Profile Obfuscation using RRAMs to Counter DPA Attacks

    Get PDF
    Side channel attacks, such as Differential Power Analysis (DPA), denote a special class of attacks in which sensitive key information is unveiled through information extracted from the physical device executing a cryptographic algorithm. This information leakage, known as side channel information, occurs from computations in a non-ideal system composed of electronic devices such as transistors. Power dissipation is one classic side channel source, which relays information of the data being processed. DPA uses statistical analysis to identify data-dependent correlations in sets of power measurements. Countermeasures against DPA focus on hiding or masking techniques at different levels of design abstraction and are typically associated with high power and area cost. Emerging technologies such as Resistive Random Access Memory (RRAM), offer unique opportunities to mitigate DPAs with their inherent memristor device characteristics such as variability in write time, ultra low power (0.1-3 pJ/bit), and high density (4F2). In this research, an RRAM based architecture is proposed to mitigate the DPA attacks by obfuscating the power profile. Specifically, a dual RRAM based memory module masks the power dissipation of the actual transaction by accessing both the data and its complement from the memory in tandem. DPA attack resiliency for a 128-bit AES cryptoprocessor using RRAM and CMOS memory modules is compared against baseline CMOS only technology. In the proposed AES architecture, four single port RRAM memory units store the intermediate state of the encryption. The correlation between the state data and sets of power measurement is masked due to power dissipated from inverse data access on dual RRAM memory. A customized simulation framework is developed to design the attack scenarios using Synopsys and Cadence tool suites, along with a Hamming weight DPA attack module. The attack mounted on a baseline CMOS architecture is successful and the full key is recovered. However, DPA attacks mounted on the dual CMOS and RRAM based AES cryptoprocessor yielded unsuccessful results with no keys recovered, demonstrating the resiliency of the proposed architecture against DPA attacks

    Digital Simulations of Memristors Towards Integration with Reconfigurable Computing

    Get PDF
    The end of Moore’s Law has been predicted for decades. Demand for increased parallel computational performance has been increased by improvements in machine learning. This past decade has demonstrated the ever-increasing creativity and effort necessary to extract scaling improvements in CMOS fabrication processes. However, CMOS scaling is nearing its fundamental physical limits. A viable path for increasing performance is to break the von Neumann bottleneck. In-memory computing using emerging memory technologies (e.g. ReRam, STT, MRAM) offers a potential path beyond the end of Moore’s Law. However, there is currently very little support from industry tools for designers wishing to incorporate these devices and novel architectures. The primary issue for those using these tools is the lack of support for mixed-signal design, as HDLs such as Verilog were designed to work only with digital components. This work aims to improve the ability for designers to rapidly prototype their designs using these emerging memory devices, specifically memristors, by extending Verilog to support functional simulation of memristors with the Verilog Procedural Interface (VPI). In this work, demonstrations of the ability for the VPI to simulate memristors with the nonlinear ion-drift model and the behavior of a memristive crossbar array are presented
    • …
    corecore