10 research outputs found

    Hybrid CMOS/memristor circuits

    Abstract: This is a brief review of recent work on prospective hybrid CMOS/memristor circuits. Such hybrids combine the flexibility, reliability and high functionality of the CMOS subsystem with the very high density of nanoscale thin-film resistance-switching devices operating on different physical principles. Simulation and initial experimental results demonstrate that the performance of CMOS/memristor circuits for several important applications is well beyond the scaling limits of the conventional VLSI paradigm.

    ITERATIVE HEURISTICS FOR CMOL HYBRID CMOS/NANODEVICES CELLS MAPPING


    A Phase Change Memory and DRAM Based Framework For Energy-Efficient and High-Speed In-Memory Stochastic Computing

    Convolutional Neural Networks (CNNs) have proven to be highly effective in various fields related to Artificial Intelligence (AI) and Machine Learning (ML). However, the significant computational and memory requirements of CNNs make their processing highly compute- and memory-intensive. In particular, the multiply-accumulate (MAC) operation, which is a fundamental building block of CNNs, requires an enormous number of arithmetic operations. As the input dataset size increases, the traditional processor-centric von Neumann computing architecture becomes ill-suited for CNN-based applications. This results in exponentially higher latency and energy costs, making the processing of CNNs highly challenging. To overcome these challenges, researchers have explored the Processing-In-Memory (PIM) technique, which involves placing the processing unit inside or near the memory unit. This approach shortens the distance data must travel and exploits the internal memory bandwidth at the memory chip level. However, developing a reliable PIM-based system with minimal hardware modifications and design complexity remains a significant challenge. The solution proposed in the report uses different memory technologies, such as Dynamic RAM (DRAM) and phase change memory (PCM), with stochastic arithmetic and minimal add-on logic. Stochastic computing is a technique that uses random bit-streams to perform arithmetic operations instead of the traditional binary representation. This technique reduces the hardware requirements of a CNN's arithmetic operations, making it possible to implement them with minimal add-on logic. The report details the workflow for performing the arithmetic operations used by CNNs, including MAC, activation, and floating-point functions.
The proposed solution includes designs for a scalable Stochastic Number Generator (SNG), a DRAM CNN accelerator, a non-volatile memory (NVM) class PCRAM-based CNN accelerator, and DRAM-based stochastic-to-binary conversion (StoB) for in-situ deep learning. These designs use stochastic computing to reduce the hardware requirements of a CNN's arithmetic operations and enable energy- and time-efficient processing of CNNs. The report also identifies future research directions for the proposed designs, including an in-situ PCRAM-based SNG, ODIN (A Bit-Parallel Stochastic Arithmetic Based Accelerator for In-Situ Neural Network Processing in Phase Change RAM), ATRIA (Bit-Parallel Stochastic Arithmetic Based Accelerator for In-DRAM CNN Processing), and AGNI (In-Situ, Iso-Latency Stochastic-to-Binary Number Conversion for In-DRAM Deep Learning), and presents initial findings for these ideas. In summary, the report offers a comprehensive approach to the challenges of processing CNNs, and the proposed designs have the potential to improve the energy and time efficiency of CNNs significantly. Combining stochastic computing with different memory technologies enables the development of reliable PIM-based systems with minimal hardware modifications and design complexity, providing a promising path for the future of CNN-based applications.
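
The abstract's description of stochastic arithmetic can be illustrated with a minimal Python sketch. It assumes the common unipolar encoding, in which a value p in [0, 1] is a bit-stream whose bits are 1 with probability p, so that multiplying two independent streams reduces to a bitwise AND. The function names (`to_stream`, `to_value`, `sc_multiply`) are hypothetical, and the sketch does not reproduce the report's SNG, accelerator, or StoB designs.

```python
import random

def to_stream(p, n, rng):
    """Encode p in [0, 1] as n random bits, each 1 with probability p."""
    return [1 if rng.random() < p else 0 for _ in range(n)]

def to_value(stream):
    """Stochastic-to-binary conversion: the fraction of 1 bits."""
    return sum(stream) / len(stream)

def sc_multiply(a, b):
    """Multiply two independent unipolar streams with a bitwise AND."""
    return [x & y for x, y in zip(a, b)]

rng = random.Random(0)          # fixed seed so the run is repeatable
n = 10_000                      # longer streams give lower variance
a = to_stream(0.5, n, rng)
b = to_stream(0.4, n, rng)
product = to_value(sc_multiply(a, b))
print(f"estimate of 0.5 * 0.4: {product:.3f}")
```

The estimate only approaches the exact product as the stream length grows, which is the usual trade-off of this representation: very small logic per operation in exchange for longer bit-streams.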

    Abusing Hardware Race Conditions for High Throughput Energy Efficient Computation

    We propose a novel computing approach, called “Race Logic”, which utilizes a new data representation to accelerate a broad class of optimization problems, such as those solved by dynamic programming algorithms. The core idea of Race Logic is to deliberately engineer race conditions in a circuit to perform useful computation. In Race Logic, information, instead of being represented as logic levels (as is done in conventional logic), is represented as a timing delay. Computations can then be performed by observing the relative propagation times of signals injected into a configurable circuit (i.e. the outcome of races through the circuit). In this dissertation I will introduce race-based computation and discuss multiple VLSI implementations. We begin by considering a synchronous approach, which uses simple clocked delay elements. Though this synchronous implementation outperforms highly optimized conventional implementations of the well-studied DNA sequence alignment problem, its third-order energy scaling with problem size and the limited dynamic range of its timing delays are its major pitfalls. Next, in the search for energy efficiency, we study asynchronous designs in order to understand the performance trade-offs and applicability of this new architecture. Finally, I will present the results of a prototype asynchronous Race Logic chip and demonstrate that race-based computations can align up to 10 million 50-symbol-long DNA sequences per second, about 2-3 orders of magnitude faster than state-of-the-art general-purpose computing systems.
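
The idea of encoding values as delays can be sketched in software. The Python model below is an illustration, not the dissertation's circuit: it assumes a grid of cells in which each cell fires when the earliest incoming signal arrives (a first-arrival, OR-like race) and each edge adds a fixed delay. The function name `race_alignment` and the delay weights (0 for a match, 1 for a mismatch or an insertion/deletion) are assumptions chosen so that the arrival time at the far corner equals the Levenshtein edit distance.

```python
def race_alignment(s, t, match=0, mismatch=1, indel=1):
    """Arrival time of the first signal at the far corner of the grid.

    arrival[i][j] is when cell (i, j) fires. A cell fires on its earliest
    input, so the race computes a minimum over the incoming delayed paths.
    """
    rows, cols = len(s) + 1, len(t) + 1
    arrival = [[float("inf")] * cols for _ in range(rows)]
    arrival[0][0] = 0  # the signal is injected at the top-left corner
    for i in range(rows):
        for j in range(cols):
            if i == 0 and j == 0:
                continue
            candidates = []
            if i > 0:  # signal from above, delayed by an indel edge
                candidates.append(arrival[i - 1][j] + indel)
            if j > 0:  # signal from the left, delayed by an indel edge
                candidates.append(arrival[i][j - 1] + indel)
            if i > 0 and j > 0:  # diagonal edge: match or mismatch delay
                delay = match if s[i - 1] == t[j - 1] else mismatch
                candidates.append(arrival[i - 1][j - 1] + delay)
            arrival[i][j] = min(candidates)  # first arrival wins the race
    return arrival[-1][-1]

print(race_alignment("kitten", "sitting"))  # prints 3
```

In hardware the minimum is not computed explicitly; it falls out of which signal reaches a gate first, which is what makes the representation attractive for dynamic programming problems.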

    Memristores (Memristors)

    Master's in Electronics and Telecommunications Engineering (Mestrado em Engenharia Eletrónica e Telecomunicações). The memristor was proposed by Leon Chua in 1971 purely for the sake of mathematical completeness, an idea that was not widely accepted by the scientific community. Only decades later, after HP’s announcement in 2008, did memristors start to be seen as realizable elements rather than mere mathematical curiosities. These devices have characteristics distinct from those of other known electronic devices. Besides being passive, they are characterized by the following postulates: a characteristic voltage-current loop with hysteresis that is single-valued at the origin, a gradual decrease of the loop area with increasing frequency, and purely resistive behaviour at infinite frequency. A memristive device’s response depends greatly on the amplitude and frequency of the input signal and on its own internal characteristics. There is therefore a clear need for procedures and attributes that make it possible to classify and categorize memristive devices. These attributes, similar in essence to the figures of merit of devices such as diodes and transistors, will in the near future allow memristive devices to be better chosen for specific applications. To obtain these attributes, a morphological analysis of the area and length of the voltage-current loops of several theoretical memristive device models was carried out in MATLAB, varying their internal characteristics over arrays of frequency and amplitude values of the input signal. Afterwards, a memristor device emulator was built to corroborate the theoretical results. To this end, the voltage-current loops were measured for several input values and the loops’ areas and lengths were calculated.
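
The loop-area and loop-length analysis can be sketched as follows. The thesis used MATLAB; this Python version is only an illustration under stated assumptions: a linear-drift memristor model, hypothetical parameter values (`r_on`, `r_off`, `k`, `x0`), forward-Euler integration, and a length measured on axes normalized by their peak values. Because a pinched loop is a figure-eight whose two lobes have opposite orientation, the area is summed per half-cycle rather than taken over the whole period.

```python
import math

def simulate_loop(freq, amp=1.0, steps=2000,
                  r_on=100.0, r_off=16e3, k=1e4, x0=0.1):
    """Voltage and current samples over one period of a sinusoidal drive.

    Assumed linear-drift model: M(x) = r_on*x + r_off*(1 - x), dx/dt = k*i.
    """
    dt = 1.0 / (freq * steps)
    x, vs, cs = x0, [], []
    for n in range(steps + 1):
        v = amp * math.sin(2 * math.pi * freq * n * dt)
        i = v / (r_on * x + r_off * (1.0 - x))
        vs.append(v)
        cs.append(i)
        x = min(1.0, max(0.0, x + k * i * dt))  # Euler step, state kept in [0, 1]
    return vs, cs

def polygon_area(xs, ys):
    """Absolute shoelace area of the closed polygon through the points."""
    n = len(xs)
    s = sum(xs[j] * ys[(j + 1) % n] - xs[(j + 1) % n] * ys[j] for j in range(n))
    return abs(s) / 2.0

def loop_area(vs, cs):
    """Sum of the two lobe areas (one lobe per half-cycle), in V*A."""
    half = len(vs) // 2
    return (polygon_area(vs[:half + 1], cs[:half + 1])
            + polygon_area(vs[half:], cs[half:]))

def loop_length(vs, cs):
    """Path length of the loop on axes normalized by their peak values."""
    v_max = max(abs(v) for v in vs)
    i_max = max(abs(i) for i in cs)
    return sum(math.hypot((vs[j + 1] - vs[j]) / v_max,
                          (cs[j + 1] - cs[j]) / i_max)
               for j in range(len(vs) - 1))

areas = {f: loop_area(*simulate_loop(f)) for f in (1.0, 10.0, 100.0)}
for f, a in areas.items():
    print(f"{f:6.1f} Hz  area = {a:.3e}")
```

With these assumed parameters the area shrinks as the drive frequency rises, consistent with the postulate that the loop area decreases with increasing frequency.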

    CMOL FPGA circuits

    Abstract: This paper describes an architecture of an FPGA-like fabric for future hybrid “CMOL” circuits. Such circuits will combine a semiconductor-transistor (CMOS) stack and a two-level nanowire crossbar with molecular-scale two-terminal nanodevices (programmable diodes) formed at each crosspoint. We have developed a custom set of tools for CMOL FPGA circuit design automation, and used it to evaluate the performance of these circuits on the Toronto 20 benchmark set, so far without optimization of several parameters including the power supply voltage, nanowire pitch and maximum NOR fan-in. The results show that even without such optimization, CMOL FPGA circuits may provide a density advantage of more than two orders of magnitude over the traditional CMOS FPGA with the same CMOS design rules, at comparable time delay, acceptable power consumption, and potentially high defect tolerance.

    Fine-Grained Defect Diagnosis for CMOL FPGA Circuits


    High-Throughput Pattern Matching With CMOL FPGA Circuits: Case for Logic-in-Memory Computing
