
    In-Memory Computing Using Formal Methods and Paths-Based Logic

    The continued scaling of the CMOS device has been largely responsible for the increase in computational power and consequent technological progress over the last few decades. However, the end of Dennard scaling has interrupted this era of sustained exponential growth in computing performance. Indeed, we are quickly reaching an impasse in the form of limitations in the lithographic processes used to fabricate CMOS circuits and, even worse, we are beginning to face fundamental physical phenomena, such as quantum tunneling, that are pervasive at the nanometer scale. Such phenomena manifest themselves in prohibitively high leakage currents and process variations, leading to inaccurate computations. As a result, there has been a surge of interest in computing architectures that can replace traditional CMOS transistor-based methods. This thesis is a thorough investigation of how computations can be performed on one such architecture, called a crossbar. The methods proposed in this document apply to any crossbar consisting of two-terminal connective devices. First, we demonstrate how paths of electric current between two wires can be used as design primitives in a crossbar. We then leverage principles from the field of formal methods, in particular the area of bounded model checking, to automate the synthesis of crossbar designs for computing arithmetic operations. We demonstrate that our approach yields circuits that are state-of-the-art in terms of the number of operations required to perform a computation. Finally, we look at the benefits of using a 3D crossbar for computation; that is, a crossbar consisting of multiple layers of interconnects. A novel 3D crossbar computing paradigm is proposed for solving the Boolean matrix multiplication and transitive closure problems, and we show how this paradigm can be utilized, with small modifications, in the XPoint crossbar memory architecture recently announced by Intel.
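
    As a software analogue of the two problems named above, the following minimal Python sketch computes the Boolean matrix product and the transitive closure by repeated squaring. It is illustrative only (all names are ours), not the thesis's crossbar-based paradigm, which evaluates these operations with paths of electric current across interconnect layers.

        def bool_matmul(A, B):
            """Boolean matrix product: C[i][j] = OR_k (A[i][k] AND B[k][j])."""
            n = len(A)
            return [[any(A[i][k] and B[k][j] for k in range(n))
                     for j in range(n)] for i in range(n)]

        def transitive_closure(A):
            """Reachability matrix of the digraph with adjacency matrix A,
            via repeated squaring of (A OR I): O(log n) Boolean products."""
            n = len(A)
            # Close under reflexivity: R[i][j] = "j reachable from i in >= 0 steps".
            R = [[A[i][j] or i == j for j in range(n)] for i in range(n)]
            steps = 1
            while steps < n:
                R = bool_matmul(R, R)  # doubles the path length captured by R
                steps *= 2
            return R

        # Example: the 3-node chain 0 -> 1 -> 2; node 2 becomes reachable from 0.
        A = [[0, 1, 0],
             [0, 0, 1],
             [0, 0, 0]]
        print(transitive_closure(A))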

    Low-power emerging memristive designs towards secure hardware systems for applications in internet of things

    Emerging memristive devices offer enormous advantages for applications such as non-volatile memories and in-memory computing (IMC), but there is also rising interest in using memristive technologies for security applications in the era of the Internet of Things (IoT). This review article presents low-power design techniques based on emerging memristive technology for hardware security primitives and systems in the IoT. By reviewing the state of the art in three highlighted memristive application areas, i.e. memristive non-volatile memory, memristive reconfigurable logic computing, and memristive artificial intelligence computing, their application-level impacts on novel implementations of secret key generation, crypto functions, and machine learning attacks are explored, respectively. For low-power security applications in the IoT, it is essential to understand how best to realize cryptographic circuitry using memristive circuits, to assess the security implications of memristive crypto implementations, and to develop novel computing paradigms that enhance their security. This review article aims to help researchers explore security solutions, analyze possible new threats, and develop corresponding protections for secure hardware systems based on low-cost memristive circuit designs.
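
    To make one of the reviewed primitive types concrete: a weak-PUF-style secret key can be derived from random process variation in memristor resistances. The Python sketch below is a toy statistical model under our own assumptions (the readout function, nominal resistance, and pairwise comparison scheme are illustrative, not taken from the article).

        import random

        def read_resistances(n_cells, chip_seed):
            """Stand-in for reading n memristor cells: each device's resistance
            varies randomly around a nominal value due to process variation."""
            rng = random.Random(chip_seed)  # the seed models one physical chip
            nominal_ohms = 10_000.0         # illustrative nominal resistance
            return [rng.gauss(nominal_ohms, 500.0) for _ in range(n_cells)]

        def key_from_pairs(resistances):
            """One key bit per adjacent cell pair: 1 if the first resistance
            exceeds the second. Pairwise comparison cancels chip-wide drift
            that would bias an absolute-threshold scheme."""
            return [1 if r1 > r2 else 0
                    for r1, r2 in zip(resistances[0::2], resistances[1::2])]

        key_a = key_from_pairs(read_resistances(256, chip_seed=1))
        key_b = key_from_pairs(read_resistances(256, chip_seed=2))
        diff = sum(a != b for a, b in zip(key_a, key_b))
        print(f"{diff} of {len(key_a)} key bits differ between the two chips")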

    Low Power Memory/Memristor Devices and Systems

    This reprint focusses on achieving low-power computation using memristive devices. The topic was designed as a convenient reference point: it contains a mix of techniques starting from the fundamental manufacturing of memristive devices all the way to applications such as physically unclonable functions, and also covers perspectives on, e.g., in-memory computing, which is inextricably linked with emerging memory devices such as memristors. Finally, the reprint contains a few articles representing how other communities (from typical CMOS design to photonics) are fighting on their own fronts in the quest towards low-power computation, as a comparison with the memristor literature. We hope that readers will enjoy discovering the articles within.

    Techniques of Energy-Efficient VLSI Chip Design for High-Performance Computing

    Implementing high-quality computing within a limited power budget is the key factor driving very large scale integration (VLSI) chip design forward. This work introduces various low-power VLSI design techniques for state-of-the-art computing. From the power supply viewpoint, conventional on-chip voltage regulators based on analog blocks impose large power and area overheads on computational chips. Motivated by this, a digital-based switchable-pin method is proposed to dynamically regulate power at low circuit cost, so that computation executes with a stable supply voltage. For the multiplier, one of the most widely used and time-consuming arithmetic units, operation in the logarithmic domain offers advantages over the binary domain in computation latency, power, and area. However, the conversion error introduced reduces the reliability of subsequent computations (e.g., multiplication and division). In this work, a fast calibration method suppressing the conversion error and its VLSI implementation are proposed. The proposed logarithmic converter can be supplied by DC power for fast conversion or by clocked power to reduce the power dissipated during conversion. Moving beyond traditional computation methods and widely used static logic, a neuron-like cell is also studied in this work. Using multiple-input floating-gate (MIFG) metal-oxide semiconductor field-effect transistor (MOSFET) based logic, a 32-bit, 16-operation arithmetic logic unit (ALU) with zipped decoding and a feedback loop is designed. The proposed ALU reduces switching power and, due to its coupling capacitors, offers stronger drive capability than a static-logic-based ALU. In addition, recent neural computations pose serious challenges to digital VLSI implementation due to their heavy matrix multiplications and non-linear functions. An analog VLSI design compatible with an external digital environment is proposed for long short-term memory (LSTM) networks. The entirely analog network computes much faster and with higher energy efficiency than its digital counterpart.
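
    The logarithmic-domain multiplier mentioned above reduces multiplication to the addition of approximate logarithms, and the log/antilog conversion is exactly where the error to be calibrated arises. Below is a minimal Python sketch of the classical Mitchell approximation; it illustrates the conversion error, not this work's calibration method or its VLSI implementation.

        def mitchell_log2(x):
            """Mitchell's approximation: log2(x) ~= k + f, where k is the
            position of the leading one and f the fractional mantissa."""
            k = x.bit_length() - 1   # floor(log2(x)) for integer x > 0
            f = x / (1 << k) - 1.0   # fraction in [0, 1)
            return k + f

        def mitchell_multiply(a, b):
            """Approximate a*b by adding the two approximate logs and
            converting back: 2^k * (1 + f) for the sum s = k + f."""
            s = mitchell_log2(a) + mitchell_log2(b)
            k, f = int(s), s - int(s)
            return (1 << k) * (1.0 + f)

        # The approximation underestimates by up to ~11%; this conversion
        # error is what a calibration stage would suppress.
        for a, b in [(100, 200), (37, 91), (1000, 3)]:
            approx, exact = mitchell_multiply(a, b), a * b
            print(f"{a}*{b}: exact={exact}, approx={approx:.0f}, "
                  f"error={100 * (exact - approx) / exact:.1f}%")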

    Architecture and Circuit Design Optimization for Compute-In-Memory

    The objective of the proposed research is to optimize compute-in-memory (CIM) design for accelerating Deep Neural Network (DNN) algorithms. As peripheral circuits such as the analog-to-digital converter (ADC) introduce significant overhead in CIM inference design, the research first focuses on circuit optimization for inference acceleration and proposes a resistive random access memory (RRAM) based ADC-free in-memory compute scheme. We comprehensively explore the trade-offs among different types of ADCs and investigate a new ADC design especially suited to CIM, which performs an analog shift-add over multiple weight significance bits, improving throughput and energy efficiency under similar area constraints. Furthermore, we prototype an ADC-free CIM inference chip with fully analog data processing between sub-arrays, which significantly improves hardware performance over conventional CIM designs and achieves near-software classification accuracy on the ImageNet and CIFAR-10/-100 datasets. Secondly, the research focuses on hardware support for CIM on-chip training. To maximize hardware reuse under the CIM weight-stationary dataflow, we propose CIM training architectures with a transpose weight mapping strategy. The cell design and periphery circuitry are modified to efficiently support bi-directional computation. A novel solution for signed-number multiplication is also proposed to handle the negative inputs in backpropagation. Finally, we propose an SRAM-based CIM training architecture and comprehensively explore system-level hardware performance for DNN on-chip training based on silicon measurement results.
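
    The analog shift-add over weight significance bits can be pictured with a digital stand-in: each bit plane of the stored weights yields a partial dot product along the bitlines, and the partial sums are combined by shift-and-add according to bit significance. The Python sketch below illustrates that bit-sliced scheme under our own assumptions; it is not the prototyped analog circuit.

        import numpy as np

        def cim_dot_product(inputs, weights, n_bits=4):
            """Bit-sliced dot product in the style of a CIM macro: weights
            are split into bit planes, each plane contributes a partial sum
            (the bitline current summation), and partial sums are combined
            by shift-and-add. Digital stand-in for the analog scheme."""
            inputs = np.asarray(inputs, dtype=np.int64)
            weights = np.asarray(weights, dtype=np.int64)
            assert (weights >= 0).all() and (weights < (1 << n_bits)).all()
            acc = 0
            for b in range(n_bits):
                bit_plane = (weights >> b) & 1     # one cell column per bit
                partial = int(inputs @ bit_plane)  # analog current summation
                acc += partial << b                # shift-add by significance
            return acc

        x = [3, 1, 4, 1]
        w = [5, 9, 2, 6]
        print(cim_dot_product(x, w), "==", int(np.dot(x, w)))  # both print 38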