
    Novel arithmetic implementations using cellular neural network arrays.

    The primary goal of this research is to explore the use of arrays of analog self-synchronized cells---the cellular neural network (CNN) paradigm---in the implementation of novel digital arithmetic architectures. In exploring this paradigm we also discover that these CNN arrays produce very low system noise, that is, noise generated by the rapid switching of current through power-supply die connections, so-called di/dt noise. With the migration to sub-100-nanometer process technology, signal integrity is becoming a critical issue when integrating analog and digital components onto the same chip, so the CNN architectural paradigm offers a potential solution to this problem. A typical example is the replacement of conventional digital circuitry adjacent to sensitive bio-sensors in an SoC bio-platform. The focus of this research is therefore to discover novel approaches to building low-noise digital arithmetic circuits using analog cellular neural networks, essentially implementing asynchronous digital logic with the same circuit components as used in analog circuit design. We begin by improving upon previous research into CNN binary arithmetic arrays. The second phase of our research extends the binary arithmetic method to binary signed-digit (BSD) arithmetic; to this end, a new class of CNNs with three stable states is introduced and used to implement arithmetic circuits that take binary inputs and outputs but internally use the BSD number representation. Finally, we develop CNN arrays for a two-dimensional number representation, the double-base number system (DBNS). A novel adder architecture is described in detail that performs the addition while also reducing the representation for further processing; the design incorporates an innovative self-programmable array. Extensive simulations have shown that our new architectures can reduce system noise by almost 70 dB and crosstalk by more than 23 dB relative to standard digital implementations.
    Dept. of Electrical and Computer Engineering. Paper copy at Leddy Library: Theses & Major Papers - Basement, West Bldg. / Call Number: Thesis2005 .I27. Source: Dissertation Abstracts International, Volume: 66-11, Section: B, page: 6159. Thesis (Ph.D.), University of Windsor (Canada), 2005.
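    As a concrete illustration of the double-base number system the adder above operates on, here is a minimal Python sketch (ours, not the thesis's CNN implementation) of the standard greedy DBNS expansion, which writes any positive integer as a sum of terms 2^a * 3^b:

```python
# Greedy double-base expansion: repeatedly subtract the largest
# {2,3}-integer 2^a * 3^b not exceeding the remainder.
def greedy_dbns(n):
    """Return exponent pairs (a, b) such that n == sum of 2**a * 3**b."""
    terms = []
    while n > 0:
        best, best_ab = 0, (0, 0)
        b, p3 = 0, 1
        while p3 <= n:
            a = (n // p3).bit_length() - 1   # largest a with 2^a * 3^b <= n
            val = (1 << a) * p3
            if val > best:
                best, best_ab = val, (a, b)
            b, p3 = b + 1, p3 * 3
        terms.append(best_ab)
        n -= best
    return terms

print(greedy_dbns(127))   # [(2, 3), (1, 2), (0, 0)] -> 108 + 18 + 1 = 127
```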

    Computer arithmetic based on the Continuous Valued Number System


    Overcoming Noise and Variations in Low-Precision Neural Networks

    This work explores the impact of various design and training choices on the resilience of a neural network subjected to noise and/or device variations. Simulations were performed under the expectation that the neural network would be implemented on analog hardware; in that context there will be random noise within the circuit as well as variations in device characteristics between fabricated devices. The results show how noise can be added during the training process to reduce the impact of post-training noise. Architectural choices for the neural network also directly impact the performance variation between devices. The simulated neural networks were more robust to noise with minimal architectures having fewer layers; if more neurons are needed for better fitting, networks with more neurons in shallow layers and fewer in deeper layers closer to the output tend to perform better. The work also demonstrates that activation functions with lower slopes better suppress noise in the neural network, and that accuracy can be made more consistent by introducing sparsity into the network. To that end, an evaluation is included of different methods for generating sparse architectures for smaller neural networks, and a new method is proposed that consistently outperforms the methods most commonly used in larger, deeper networks. Ph.D. dissertation.
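    The central trick reported above, injecting noise during training so that post-training analog noise hurts less, can be sketched in a few lines. The following PyTorch fragment is our illustration of the general technique, not the work's exact code; the layer class and noise level are assumptions:

```python
import torch
import torch.nn as nn

class NoisyLinear(nn.Linear):
    """Linear layer that perturbs its weights with zero-mean Gaussian
    noise on every training-mode forward pass, emulating the random
    circuit noise expected on analog hardware."""
    def __init__(self, in_features, out_features, noise_std=0.02):
        super().__init__(in_features, out_features)
        self.noise_std = noise_std  # assumed noise scale; tune per device model

    def forward(self, x):
        if self.training and self.noise_std > 0:
            noisy_w = self.weight + torch.randn_like(self.weight) * self.noise_std
            return nn.functional.linear(x, noisy_w, self.bias)
        return super().forward(x)  # clean weights at inference time
```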

    Bio-inspired learning and hardware acceleration with emerging memories

    Machine learning has permeated many aspects of engineering, ranging from Internet of Things (IoT) applications to big data analytics. While the computing resources available to implement these algorithms have become more powerful, both in terms of the complexity of problems that can be solved and overall computing speed, the huge energy costs involved remain a significant challenge. The human brain, which has evolved over millions of years, is widely accepted as the most efficient control and cognitive processing platform. Neuro-biological studies have established that information processing in the human brain relies on impulse-like signals emitted by neurons, called action potentials. Motivated by these facts, spiking neural networks (SNNs), a bio-plausible version of neural networks, have been proposed as an alternative computing paradigm in which the timing of spikes generated by artificial neurons is central to learning and inference. This dissertation demonstrates the computational power of SNNs using conventional CMOS and emerging nanoscale hardware platforms. The first half of this dissertation presents an SNN architecture trained with a supervised spike-based learning algorithm for the handwritten digit classification problem. This network achieves an accuracy of 98.17% on the MNIST test dataset with about 4X fewer parameters than state-of-the-art neural networks achieving over 99% accuracy. In addition, a scheme for parallelizing and speeding up SNN simulation on a GPU platform is presented. The second half of this dissertation presents an optimal hardware design for accelerating SNN inference and training with SRAM (static random access memory) and nanoscale non-volatile memory (NVM) crossbar arrays. Three prominent NVM devices are studied for realizing hardware accelerators for SNNs: Phase Change Memory (PCM), Spin Transfer Torque RAM (STT-RAM), and Resistive RAM (RRAM). The analysis shows that a spike-based inference engine with crossbar arrays of STT-RAM bit-cells is 2X and 5X more efficient than PCM and RRAM memories, respectively. Furthermore, the STT-RAM design has nearly 6X higher throughput per unit watt per unit area than an equivalent SRAM-based design. A hardware accelerator with on-chip learning on an STT-RAM memory array is also designed, requiring 16 bits of floating-point synaptic weight precision to reach the baseline SNN algorithmic performance on the MNIST dataset. The complete design with the STT-RAM crossbar array achieves nearly 20X higher throughput per unit watt per unit mm^2 than an equivalent design with SRAM memory. In summary, this work demonstrates the potential of spike-based neuromorphic computing algorithms and their efficient realization in hardware based on conventional CMOS as well as emerging technologies. The schemes presented here can be extended to design spike-based systems that can be ubiquitously deployed for energy- and memory-constrained edge computing applications.
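    The spike-timing-centric computation described above bottoms out in simple neuron dynamics. Below is a minimal sketch (ours, independent of the dissertation's simulator) of a leaky integrate-and-fire neuron, the usual building block of such SNNs:

```python
import numpy as np

def lif_simulate(input_current, dt=1e-3, tau=20e-3,
                 v_rest=0.0, v_thresh=1.0, v_reset=0.0):
    """Simulate one leaky integrate-and-fire neuron.
    Returns the time steps at which it spiked."""
    v, spike_times = v_rest, []
    for t, i_in in enumerate(input_current):
        v += (dt / tau) * (v_rest - v + i_in)  # leak toward rest + integrate input
        if v >= v_thresh:                      # threshold crossing emits a spike
            spike_times.append(t)
            v = v_reset                        # reset membrane potential
    return spike_times

print(lif_simulate(np.full(200, 1.5)))  # constant drive -> regular spike train
```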

    Survey of FPGA applications in the period 2000 – 2015 (Technical Report)

    Romoth J, Porrmann M, Rückert U. Survey of FPGA applications in the period 2000 – 2015 (Technical Report); 2017.
    Since their introduction, FPGAs have appeared in more and more fields of application. Their key advantage is the combination of software-like flexibility with the performance otherwise reserved for hardware. Nevertheless, every application field imposes its own requirements on the computational architecture used. This paper provides an overview of the different topics FPGAs have been used for in the last 15 years of research, and why they have been chosen over other processing units such as CPUs.

    Neuromodulation of Spatial Associations: Evidence from Choice Reaction Tasks During Transcranial Direct Current Stimulation

    Many aspects of human behavior and cognition are influenced by covert implicit processes that are not necessarily available to intentional planning. Implicit cognitive biases can be measured in behavioral tasks, yielding SNARC effects for spatial associations of numerical and non-numerical sequences, or the implicit association test effect for associations between insect-flower and negative-positive categories. Using concurrent neuromodulation with transcranial direct current stimulation (tDCS), subthreshold activity patterns in prefrontal cortical regions can be experimentally manipulated to reduce implicit processing. The application of tDCS can thus test hypotheses about a unique neurocognitive origin of implicit cognitive biases across spatial-numerical and non-numerical domains. However, the effects of tDCS are determined not only by the superimposed electric fields but also by task characteristics. To outline the possibilities of task-specific targeting of tDCS, task characteristics and instructions can be varied systematically in combination with neuromodulation. In the present thesis, implicit cognitive processes are assessed in different paradigms concurrently with left-hemispheric prefrontal tDCS to investigate a verbal processing hypothesis for implicit associations in general. In psychological experiments, simple choice reaction tasks measure implicit SNARC and SNARC-like effects as relative left-hand vs. right-hand latency advantages for responding to smaller-number or ordinal-sequence targets. However, different combinations of polarity-dependent tDCS with stimuli and task procedures also reveal domain-specific involvements and dissociations. Discounting previous unified theories of the SNARC effect, polarity-specific neuromodulation effects dissociate numbers from weekday and month ordinal sequences. Considering also previous results and patient studies, I present a hybrid and augmented working memory account and elaborate the linguistic markedness correspondence principle as one critical verbal mechanism among competing covert coding mechanisms. Finally, a general stimulation rationale based on verbal working memory is tested in separate experiments extending to non-spatial implicit association test effects. Regarding cognitive tDCS effects, the present studies show polarity asymmetry and task-induced activity dependence of state-dependent neuromodulation. Overall, distinct combinations of the identical tDCS electrode configuration with different tasks influence behavioral outcomes substantially, which will allow for improved task- and domain-specific targeting.

    Neuromorphic computing using non-volatile memory

    Dense crossbar arrays of non-volatile memory (NVM) devices represent one possible path for implementing massively-parallel and highly energy-efficient neuromorphic computing systems. We first review recent advances in the application of NVM devices to three computing paradigms: spiking neural networks (SNNs), deep neural networks (DNNs), and ‘Memcomputing’. In SNNs, NVM synaptic connections are updated by a local learning rule such as spike-timing-dependent-plasticity, a computational approach directly inspired by biology. For DNNs, NVM arrays can represent matrices of synaptic weights, implementing the matrix–vector multiplication needed for algorithms such as backpropagation in an analog yet massively-parallel fashion. This approach could provide significant improvements in power and speed compared to GPU-based DNN training, for applications of commercial significance. We then survey recent research in which different types of NVM devices – including phase change memory, conductive-bridging RAM, filamentary and non-filamentary RRAM, and other NVMs – have been proposed, either as a synapse or as a neuron, for use within a neuromorphic computing application. The relevant virtues and limitations of these devices are assessed, in terms of properties such as conductance dynamic range, (non)linearity and (a)symmetry of conductance response, retention, endurance, required switching power, and device variability.
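    The matrix–vector product mentioned above is what a crossbar computes physically: device conductances encode the weight matrix, row voltages encode the input vector, and each column wire sums currents by Kirchhoff's law. A minimal idealized sketch (ours; real arrays add device nonlinearity, noise, and wire resistance):

```python
import numpy as np

rng = np.random.default_rng(0)
G = rng.uniform(1e-6, 1e-4, size=(128, 64))  # device conductances (siemens)
V = rng.uniform(0.0, 0.2, size=128)          # read voltages on the 128 rows

# Column current I_j = sum_i V_i * G[i, j]; the analog array evaluates
# all 64 of these sums in parallel in a single read operation.
I = V @ G
print(I.shape)  # (64,) -- one accumulated current per column
```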

    A Scalable Flash-Based Hardware Architecture for the Hierarchical Temporal Memory Spatial Pooler

    Hierarchical temporal memory (HTM) is a biomimetic machine learning algorithm focused on modeling the structural and algorithmic properties of the neocortex. It comprises two components, realizing pattern recognition of spatial and temporal data, respectively. HTM research has gained momentum in recent years, leading to both hardware and software exploration of its algorithmic formulation. Previous work on HTM has centered on addressing performance concerns; however, the memory-bound operation of HTM presents significant challenges to scalability. In this work, a scalable flash-based storage processor unit, Flash-HTM (FHTM), is presented along with a detailed analysis of its potential scalability. FHTM leverages SSD flash technology to implement the spatial pooler of the HTM cortical learning algorithm. The ability of FHTM to scale with increasing model complexity is addressed with respect to design footprint, memory organization, and power efficiency. Additionally, a mathematical model of the hardware is evaluated against the MNIST dataset, yielding 91.98% classification accuracy. A fully custom layout is developed to validate the design in a TSMC 180 nm process. The area and power footprints of the spatial pooler are 30.538 mm^2 and 5.171 mW, respectively. Storage processor units have the potential to be viable platforms to support implementations of HTM at scale.
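    For readers unfamiliar with the algorithm FHTM accelerates, the spatial pooler's core step is an overlap computation followed by k-winners-take-all inhibition, producing a sparse distributed representation. A heavily simplified sketch (ours, omitting permanence learning and boosting):

```python
import numpy as np

rng = np.random.default_rng(1)
n_inputs, n_columns, k = 256, 128, 8
# Which input bits each column has connected synapses to.
connected = rng.random((n_columns, n_inputs)) < 0.1

def spatial_pool(input_bits):
    overlap = (connected & input_bits).sum(axis=1)  # active connected synapses
    active = np.zeros(n_columns, dtype=bool)
    active[np.argsort(overlap)[-k:]] = True         # k-winners-take-all inhibition
    return active

sdr = spatial_pool(rng.random(n_inputs) < 0.05)
print(sdr.sum())  # exactly k active columns: a sparse representation
```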

    Hardware Implementations of Spiking Neural Networks and Artificially Intelligent Systems

    Artificial spiking neural networks are gaining prominence due to their potential advantages over traditional, time-static artificial neural networks, and custom hardware implementations of spiking neural networks offer many advantages over other implementation media. Two main topics are the focus of this work. First, digital hardware implementations of spiking neurons and neuromorphic hardware are explored and presented, including novel designs that lower digital hardware requirements and reduce power consumption. Second, a novel method is proposed for selectively adding sparsity to a spiking neural network based on training set images for pattern recognition applications, thereby greatly reducing the inference time required in a digital hardware implementation.
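    The dissertation's exact sparsification rule is not reproduced here, but one plausible reading of "selectively adding sparsity based on training set images" is masking input synapses for pixels that are almost never active across the training data. A hypothetical sketch of that idea:

```python
import numpy as np

def activity_mask(train_images, threshold=0.01):
    """Keep only input synapses whose pixel is active in at least
    `threshold` fraction of the training images (assumed heuristic)."""
    on_rate = (train_images > 0).mean(axis=0)  # per-pixel activity rate
    return on_rate >= threshold                # boolean keep-mask over inputs

# Stand-in data shaped like flattened 28x28 digits; not real MNIST.
images = np.random.default_rng(2).random((1000, 784)) < 0.1
mask = activity_mask(images)
print(f"kept {mask.sum()} of {mask.size} input synapses")
```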