
    Simulation and implementation of novel deep learning hardware architectures for resource constrained devices

    Corey Lammie designed mixed-signal memristive-complementary metal-oxide-semiconductor (CMOS) and field-programmable gate array (FPGA) hardware architectures that reduce the power and resource requirements of Deep Learning (DL) systems during both inference and training. Disruptive design methodologies, such as those explored in this thesis, can be used to facilitate the design of next-generation DL systems.

    Neuro-memristive Circuits for Edge Computing: A review

    The volume, veracity, variability, and velocity of data produced by the ever-growing network of sensors connected to the Internet pose challenges for the power management, scalability, and sustainability of cloud computing infrastructure. Increasing the data-processing capability of edge computing devices at lower power requirements can reduce several overheads for cloud computing solutions. This paper reviews neuromorphic CMOS-memristive architectures that can be integrated into edge computing devices. We discuss why neuromorphic architectures are useful for edge devices and present the advantages, drawbacks, and open problems in the field of neuro-memristive circuits for edge computing.

    Algorithm and Hardware Co-design for Learning On-a-chip

    Machine learning technology has achieved remarkable results in recent years, rivalling or exceeding human performance in many intellectual tasks, including image recognition, face detection, and the game of Go. Many machine learning algorithms require enormous amounts of computation, such as the multiplication of large matrices. As silicon technology has scaled into the sub-14 nm regime, simply scaling down devices can no longer provide sufficient speed-up; new device technologies and system architectures are needed to improve computing capacity. Hardware designed specifically for machine learning is therefore in high demand, and efforts must target the joint design and optimization of both hardware and algorithms. For machine learning acceleration, traditional SRAM- and DRAM-based systems suffer from low capacity, high latency, and high standby power. Emerging memories, such as Phase-Change Random Access Memory (PRAM), Spin-Transfer Torque Magnetic Random Access Memory (STT-MRAM), and Resistive Random Access Memory (RRAM), are promising candidates offering low standby power, high data density, fast access, and excellent scalability. This dissertation proposes a hierarchical memory modeling framework and models PRAM and STT-MRAM at four levels of abstraction. With the proposed models, various simulations are conducted to investigate performance, optimization, variability, reliability, and scalability. Emerging memory devices such as RRAM can work as a 2-D crosspoint array to speed up the multiplication and accumulation in machine learning algorithms; a sketch of this operation follows below. This dissertation proposes a new parallel programming scheme to achieve in-memory learning with an RRAM crosspoint array. The programming circuitry is designed and simulated in TSMC 65 nm technology, showing a 900x speedup on the dictionary learning task compared to CPU performance. From the algorithm perspective, inspired by the high accuracy and low power of the brain, this dissertation proposes a bio-plausible feedforward-inhibition spiking neural network with a Spike-Rate-Dependent Plasticity (SRDP) learning rule. It achieves more than 95% accuracy on the MNIST dataset, comparable to the sparse coding algorithm but requiring far fewer computations. The role of inhibition in this network is systematically studied and shown to improve hardware efficiency in learning.
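
    As context for the crosspoint-array claim above, here is a minimal sketch (not taken from the dissertation) of how an RRAM crossbar performs multiply-and-accumulate in one step: weights are stored as cell conductances, inputs are applied as row voltages, and Ohm's and Kirchhoff's laws sum the resulting currents along each column. The conductance range and variation level are assumed values for illustration.

        import numpy as np

        rng = np.random.default_rng(0)
        G_ON, G_OFF = 1e-4, 1e-6          # assumed on/off conductances (siemens)

        def weights_to_conductance(W):
            # Map weights in [0, 1] linearly onto the device conductance range.
            return G_OFF + W * (G_ON - G_OFF)

        def crossbar_mac(V_in, G, sigma=0.05):
            # Column currents I = V @ G; sigma models multiplicative device variation.
            G_actual = G * (1 + sigma * rng.standard_normal(G.shape))
            return V_in @ G_actual        # one fully parallel multiply-accumulate

        W = rng.uniform(0, 1, size=(128, 64))   # 128 inputs x 64 outputs
        V = rng.uniform(0, 0.2, size=128)       # read voltages (volts)
        I_ideal = V @ weights_to_conductance(W)
        I_noisy = crossbar_mac(V, weights_to_conductance(W))
        print("mean relative error:", np.mean(np.abs(I_noisy - I_ideal) / I_ideal))

    The appeal of the scheme is that the entire matrix-vector product happens in the analog domain at once, so the cost is dominated by reading column currents rather than by O(n^2) digital multiplications; the device-variation term shows why programming accuracy matters.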

    Exploiting Nanoelectronic Properties of Memory Chips for Prevention of IC Counterfeiting

    This study presents a methodology for anti-counterfeiting of Non-Volatile Memory (NVM) chips. In particular, we experimentally demonstrate a generalized methodology for detecting (i) Integrated Circuit (IC) origin, (ii) recycled or used NVM chips, and (iii) used locations (addresses) in a chip. Our proposed methodology inspects the latency and variability signatures of Commercial Off-The-Shelf (COTS) NVM chips. The technique requires low-cycle (~100) pre-conditioning and utilizes Machine Learning (ML) algorithms. We observe different trends in the evolution of latency (sector erase or page write) with cycling across NVM technologies from different vendors. The ML-assisted approach detects IC manufacturers with 95.1% accuracy on a test dataset spanning 3 NVM technologies and 6 manufacturers (9 chip types).
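
    To make the ML-assisted step concrete, here is a hedged sketch on synthetic data. The feature construction (per-cycle erase latency with a vendor-dependent baseline and wear-out slope) and the random-forest classifier are assumptions chosen for illustration, not the paper's actual features or model.

        import numpy as np
        from sklearn.ensemble import RandomForestClassifier
        from sklearn.metrics import accuracy_score
        from sklearn.model_selection import train_test_split

        rng = np.random.default_rng(1)
        n_chips, n_cycles = 600, 100              # chips x pre-conditioning cycles

        # Hypothetical feature: per-cycle sector-erase latency. Each of 6 vendors
        # is modeled with its own baseline latency and wear-out (drift) slope.
        labels = rng.integers(0, 6, size=n_chips)
        base = 1.0 + 0.3 * labels                 # baseline latency (ms), per vendor
        slope = 0.002 * (1 + labels)              # latency drift per cycle, per vendor
        cycles = np.arange(n_cycles)
        X = (base[:, None] + slope[:, None] * cycles
             + 0.05 * rng.standard_normal((n_chips, n_cycles)))   # device noise

        X_tr, X_te, y_tr, y_te = train_test_split(X, labels, test_size=0.25,
                                                  random_state=0)
        clf = RandomForestClassifier(n_estimators=200, random_state=0)
        clf.fit(X_tr, y_tr)
        print("manufacturer classification accuracy:",
              accuracy_score(y_te, clf.predict(X_te)))

    The key idea the sketch captures is that manufacturer identity leaves a stable fingerprint in how latency evolves over cycling, so a standard classifier can separate vendors once those latency traces are collected.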

    Stochastic Memory Devices for Security and Computing

    With the widespread use of mobile computing and the Internet of Things, secure communication and chip authentication have become extremely important. Hardware-based security concepts generally provide the best performance in terms of a good standard of security, low power consumption, and high area density. In these concepts, the stochastic properties of nanoscale devices, such as physical and geometrical process variations, are harnessed for true random number generators (TRNGs) and physical unclonable functions (PUFs). Emerging memory devices, such as resistive-switching memory (RRAM), phase-change memory (PCM), and spin-transfer torque magnetic memory (STT-MRAM), rely on a unique combination of physical mechanisms for transport and switching, and thus appear to be an ideal source of entropy for TRNGs and PUFs. An overview of stochastic phenomena in memory devices and their use for developing security and computing primitives is provided. First, a broad classification of methods to generate true random numbers from the stochastic properties of nanoscale devices is presented. Then, practical implementations of stochastic TRNGs in hardware security and stochastic computing are shown. Finally, future challenges in stochastic memory development are discussed.
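
    As a conceptual illustration of the two primitives discussed above, the following sketch models a TRNG that harvests bits from stochastic switching events (with von Neumann debiasing) and a simple weak PUF that derives response bits from fixed cell-to-cell resistance variations. The device model is an assumption for illustration, not drawn from the article.

        import numpy as np

        rng = np.random.default_rng(42)

        def trng_bits(n, p_switch=0.5):
            # Raw bits from probabilistic set/reset events; von Neumann
            # debiasing keeps only unequal (01 / 10) pairs to remove bias.
            raw = rng.random(8 * n) < p_switch     # oversample raw switching events
            pairs = raw.reshape(-1, 2)
            keep = pairs[:, 0] != pairs[:, 1]      # discard 00 and 11 pairs
            return pairs[keep, 0].astype(np.uint8)[:n]

        def puf_response(challenge, variations):
            # One response bit per challenge: compare the fixed
            # (fabrication-determined) resistances of two selected cells.
            left, right = variations[challenge[:, 0]], variations[challenge[:, 1]]
            return (left > right).astype(np.uint8)

        variations = rng.normal(1.0, 0.1, size=256)      # per-cell resistance spread
        challenge = rng.integers(0, 256, size=(16, 2))   # 16 cell-pair selections
        print("TRNG sample: ", trng_bits(16))
        print("PUF response:", puf_response(challenge, variations))

    The split mirrors the article's framing: the TRNG exploits randomness that is fresh on every switching event, while the PUF exploits randomness frozen in at fabrication, which is why the same response is reproduced for the same challenge on the same chip.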