15 research outputs found

    Implementation of Olfactory Bulb Glomerular-Layer Computations in a Digital Neurosynaptic Core

    We present a biomimetic system that captures essential functional properties of the glomerular layer of the mammalian olfactory bulb, specifically including its capacity to decorrelate similar odor representations without foreknowledge of the statistical distributions of analyte features. Our system is based on a digital neuromorphic chip consisting of 256 leaky-integrate-and-fire neurons, 1024 × 256 crossbar synapses, and address-event representation communication circuits. The neural circuits configured in the chip reflect established connections among mitral cells, periglomerular cells, external tufted cells, and superficial short-axon cells within the olfactory bulb, and accept input from convergent sets of sensors configured as olfactory sensory neurons. This configuration generates functional transformations comparable to those observed in the glomerular layer of the mammalian olfactory bulb. Our circuits, consuming only 45 pJ of active energy per spike at a 0.85 V supply, can be used as the first stage of processing in low-power artificial chemical sensing devices inspired by natural olfactory systems.
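
    To make the crossbar configuration above concrete, the sketch below simulates leaky integrate-and-fire neurons fed through a 1024 × 256 binary crossbar by address-event (AER) input spikes. It is only an illustrative model: the connectivity, excitatory/inhibitory split, leak, and threshold values are assumptions, not the parameters programmed into the chip.

        import numpy as np

        # Minimal sketch: leaky integrate-and-fire (LIF) neurons driven through a
        # binary crossbar by address-event (AER) input spikes. All parameters here
        # are illustrative assumptions, not the configuration used on the chip.
        N_AXONS, N_NEURONS = 1024, 256
        rng = np.random.default_rng(0)

        axon_sign = rng.choice([-1.0, 1.0], size=N_AXONS)   # excitatory/inhibitory axons
        W = rng.integers(0, 2, size=(N_AXONS, N_NEURONS)) * axon_sign[:, None]
        leak, threshold = 0.98, 5.0
        v = np.zeros(N_NEURONS)                             # membrane potentials

        def step(v, input_events):
            """Advance one tick given the active axon addresses (AER events)."""
            drive = W[input_events].sum(axis=0)             # summed synaptic input per neuron
            v = leak * v + drive
            fired = np.flatnonzero(v >= threshold)          # neurons emitting output events
            v[fired] = 0.0                                  # reset after spiking
            return v, fired

        # Example tick: 40 randomly chosen axons carry a spike this timestep.
        events = rng.choice(N_AXONS, size=40, replace=False)
        v, out_events = step(v, events)
        print("output address-events:", out_events)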

    THE ERA OF NEUROSYNAPTICS: NEUROMORPHIC CHIPS AND ARCHITECTURE

    Since its invention, the modern computer has shown significant improvement in performance and storage capacity. However, most current processor cores remain sequential in nature, which limits the speed of computation. IBM has worked consistently on this problem, and with the launch of its neurosynaptic chips it has opened a new line of thinking. This paper reviews the various stages of research that have been instrumental in the overall development of neuromorphic architecture, which aims to build flexible, brain-like structures capable of performing a wide range of real-time computations while keeping ultra-low power consumption and small size in mind. Inspired by the human brain, which can perform complex tasks rapidly and accurately without being programmed and while using very little energy, the TrueNorth chip attempts to mimic the brain so as to perform complex computations at a faster pace. This has inspired a new field of study aimed at developing cognitive computing systems that could potentially emulate the brain's computing efficiency, size, and power. The paper also highlights the challenges that prevailing technologies pose for neuromorphic architecture, which remain a major field of research for the near future.

    A 22-pJ/spike 73-Mspikes/s 130k-compartment neural array transceiver with conductance-based synaptic and membrane dynamics

    Neuromorphic cognitive computing offers a bio-inspired means to approach the natural intelligence of biological neural systems in silicon integrated circuits. Typically, such circuits either reproduce biophysical neuronal dynamics in great detail as tools for computational neuroscience, or abstract away the biology by simplifying the functional forms of neural computation in large-scale systems for machine intelligence with high integration density and energy efficiency. Here we report a hybrid that offers biophysical realism in the emulation of multi-compartmental neuronal network dynamics at very large scale with high implementation efficiency, yet with high flexibility in configuring the functional form and the network topology. The integrate-and-fire array transceiver (IFAT) chip emulates the continuous-time analog membrane dynamics of 65k two-compartment neurons with conductance-based synapses. Fired action potentials are registered as address-event encoded output spikes, while the four types of synapses coupling to each neuron are activated by address-event decoded input spikes for fully reconfigurable synaptic connectivity, facilitating virtual wiring implemented by routing address-event spikes externally through a synaptic routing table. The peak conductance of a synapse activation specified by the address-event input spans three decades of dynamic range, digitally controlled by pulse width and amplitude modulation (PWAM) of the drive voltage activating the log-domain linear synapse circuit. Two nested levels of micro-pipelining in the IFAT architecture improve both the throughput and the efficiency of synaptic input. This two-tier micro-pipelining results in a measured sustained peak throughput of 73 Mspikes/s and an overall chip-level energy efficiency of 22 pJ/spike. Non-uniformity in digitally encoded synapse strength due to analog mismatch is mitigated through single-point digital offset calibration. Combined with the flexibly layered and recurrent synaptic connectivity provided by hierarchical address-event routing of registered spike events through external memory, the IFAT lends itself to efficient large-scale emulation of general biophysical spiking neural networks, as well as rate-based mapping of rectified linear unit (ReLU) neural activations.
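
    As a rough illustration of conductance-based dynamics with PWAM-weighted synapse activation, the sketch below integrates a single simplified (one-compartment) neuron with excitatory and inhibitory conductance synapses. The constants and the linear pulse-width × amplitude mapping are assumptions made for the sketch, not the IFAT circuit equations.

        # Minimal sketch of one conductance-based neuron with two synapse types
        # (excitatory and inhibitory), integrated with forward Euler. The constants
        # and the single-compartment simplification are illustrative assumptions.
        dt = 1e-4                                    # timestep (s)
        E_leak, E_exc, E_inh = -70e-3, 0.0, -80e-3   # reversal potentials (V)
        g_leak, C = 10e-9, 200e-12                   # leak conductance (S), capacitance (F)
        tau_syn = 5e-3                               # synaptic conductance decay (s)
        v_thresh, v_reset = -50e-3, -70e-3

        v, g_exc, g_inh = E_leak, 0.0, 0.0

        def on_input_spike(kind, pulse_width, drive_amp):
            """PWAM-style weighting: peak conductance scales with pulse width x amplitude."""
            global g_exc, g_inh
            dg = 1e-9 * pulse_width * drive_amp      # assumed linear mapping for the sketch
            if kind == "exc":
                g_exc += dg
            else:
                g_inh += dg

        def step():
            """One Euler step of the membrane and synaptic conductance dynamics."""
            global v, g_exc, g_inh
            i_total = (g_leak * (E_leak - v) + g_exc * (E_exc - v) + g_inh * (E_inh - v))
            v += dt * i_total / C
            g_exc -= dt * g_exc / tau_syn
            g_inh -= dt * g_inh / tau_syn
            if v >= v_thresh:
                v = v_reset
                return True                          # emit an address-event output spike
            return False

        on_input_spike("exc", pulse_width=8, drive_amp=4)
        spiked = any(step() for _ in range(200))
        print("spiked:", spiked)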

    A Multicore Neuromorphic Chip Design Using Multilevel Synapses in 65nm Standard CMOS Technology

    University of Minnesota M.S. thesis. June 2015. Major: Electrical Engineering. Advisor: Chris Kim. 1 computer file (PDF); v, 38 pages. The neuromorphic research community is focused on designing hardware that is as efficient as the biological brain in terms of performance, power, and area, which opens up opportunities to optimize these designs at all levels, from architecture to devices. We propose a novel architecture with tight integration between neurons and synapses. Our 32-kb neuromorphic chip with 256 axons and 256 neurons demonstrates four neuromorphic cores operating in a completely parallel fashion. An eflash memory core representing the synapses saves power and area, and its non-volatility means it consumes zero static power. The ability to store multiple weight levels in a single cell makes the array denser. Unlike conventional flash technology, eflash does not require a specialized fabrication process, so the neuromorphic chip is implemented in a 65 nm standard CMOS technology. Current-sensing neurons with a parallel reading scheme make the neuronal operation several orders of magnitude faster than state-of-the-art neuromorphic designs.
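
    The sketch below illustrates the idea of a multilevel synapse array read by current-sensing neurons: each cell stores one of a few discrete conductance levels, and every column is summed in one parallel read. The level count, unit conductance, and read voltage are illustrative assumptions, not values from the thesis.

        import numpy as np

        # Minimal sketch of a multilevel-synapse crossbar read. Each cell stores one
        # of a few discrete levels (here 4, i.e. 2 bits per cell), and all columns
        # are read in parallel by summing cell currents, mimicking a current-sensing
        # neuron front end. All values are illustrative assumptions.
        N_AXONS, N_NEURONS, LEVELS = 256, 256, 4
        rng = np.random.default_rng(1)

        # Quantize target weights in [0, 1) to the available multilevel states.
        target = rng.random((N_AXONS, N_NEURONS))
        cells = np.round(target * (LEVELS - 1)).astype(int)   # stored level per cell
        g_unit = 50e-9                                        # conductance per level (S)

        def parallel_read(active_axons, v_read=0.5):
            """One read cycle: active rows are driven together and every column
            current is sensed at once, so each neuron's dot product costs a single access."""
            g = cells[active_axons] * g_unit                  # (k, N_NEURONS) conductances
            return (g * v_read).sum(axis=0)                   # column currents (A)

        active = rng.choice(N_AXONS, size=32, replace=False)
        currents = parallel_read(active)
        print("max column current (uA):", currents.max() * 1e6)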

    Null convention logic circuits for asynchronous computer architecture

    For most of its history, computer architecture has been able to benefit from a rapid scaling in semiconductor technology, resulting in continuous improvements to CPU design. During that period, synchronous logic has dominated because of its inherent ease of design and abundant tools. However, with the scaling of semiconductor processes into deep sub-micron and then to nano-scale dimensions, computer architecture is hitting a number of roadblocks such as high power and increased process variability. Asynchronous techniques can potentially offer many advantages compared to conventional synchronous design, including average-case vs. worst-case performance, robustness in the face of process and operating point variability, and the ready availability of high-performance, fine-grained pipeline architectures. Of the many alternative approaches to asynchronous design, Null Convention Logic (NCL) has the advantage that its quasi delay-insensitive behavior makes it relatively easy to set up complex circuits without the need for exhaustive timing analysis. This thesis examines the characteristics of an NCL-based asynchronous RISC-V CPU and analyses the problems with applying NCL to CPU design. While a number of university and industry groups have previously developed small 8-bit microprocessor architectures using NCL techniques, it is still unclear whether these offer any real advantages over conventional synchronous design. A key objective of this work has been to analyse the impact of larger word widths and more complex architectures on NCL CPU implementations. The research commenced by re-evaluating existing techniques for implementing NCL on programmable devices such as FPGAs. The little work that has been undertaken previously on FPGA implementations of asynchronous logic has been inconclusive and seems to indicate that asynchronous systems cannot be easily implemented in these devices. However, most of this work related to an alternative technique called bundled data, which is not well suited to FPGA implementation because of the difficulty in controlling and matching delays in a 'bundle' of signals. On the other hand, this thesis clearly shows that such applications are not only possible with NCL, but that there are some distinct advantages in being able to prototype complex asynchronous systems in a field-programmable technology such as the FPGA. A large part of the value of NCL derives from its architectural-level behavior, inherent pipelining, and optimization opportunities such as the merging of register and combinational logic functions. In this work, a number of NCL multiplier architectures have been analyzed to reveal the performance trade-offs between various non-pipelined, 1D and 2D organizations. Two-dimensional pipelining can easily be applied to regular architectures such as array multipliers in a way that is both high performance and area-efficient. It was found that 2D pipelining of small networks such as multipliers is around 260% faster than the equivalent non-pipelined design. However, the design uses 265% more transistors, so the methodology is mainly of benefit where performance is strongly favored over area. A pipelined 32-bit × 32-bit signed Baugh-Wooley multiplier with Wallace-tree carry-save adders (CSA), which is representative of a real design used for CPUs and DSPs, was used to further explore this concept as it is faster and has fewer pipeline stages compared to the normal array multiplier using ripple-carry adders (RCA).
    It was found that 1D pipelining with ripple-carry chains is an efficient implementation option but becomes less so for larger multipliers, due to the completion logic, for which the delay time depends largely on the number of bits involved in the completion network. The average-case performance of ripple-carry adders was explored using random input vectors, and it was observed that it offers little advantage on the smaller multiplier blocks, but this particular timing characteristic of asynchronous design styles becomes increasingly important as word size grows. Finally, this research has resulted in the development of the first 32-bit asynchronous RISC-V CPU core. Called the Redback RISC, the architecture is a structure of pipeline rings composed of computational oscillations linked with flow completeness relationships. It has been written using NELL, a commercial description/synthesis tool that outputs standard Verilog. The Redback has been analysed and compared to two approximately equivalent industry-standard 32-bit synchronous RISC-V cores (PicoRV32 and Rocket) that are already fabricated and used in industry. While the NCL implementation is larger than both commercial cores, it has similar performance and lower power compared to the PicoRV32. The implementation results were also compared against an existing NCL design tool flow (UNCLE), which showed how much the results of these implementation strategies differ. The Redback RISC achieves a similar level of throughput and 43% better power and 34% better energy compared to one of the synchronous cores under the same benchmark test and test conditions, such as input supply voltage. However, it was shown that area is the biggest drawback for NCL CPU design. The core is roughly 2.5× larger than synchronous designs. On the other hand, its area is still 2.9× smaller than previous designs using UNCLE tools. The area penalty is largely due to the unavoidable translation into a dual-rail topology when using the standard NCL cell library.
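
    The core NCL primitive referred to above is the threshold gate with hysteresis, which asserts once enough of its inputs carry DATA and returns to NULL only after a complete NULL wavefront. The sketch below models that behaviour for a TH22 gate; it is a behavioural illustration only, not the cell library or tool flow used in the thesis.

        # Minimal sketch of the NCL primitive: a THmn threshold gate with hysteresis.
        # The output asserts once at least m of its n inputs are asserted (DATA) and
        # only deasserts when all inputs have returned to NULL, which is what gives
        # NCL circuits their delay-insensitive completion behaviour.

        class THGate:
            def __init__(self, threshold, n_inputs):
                self.m = threshold
                self.n = n_inputs
                self.out = 0                  # state element: the gate holds its value

            def evaluate(self, inputs):
                assert len(inputs) == self.n
                high = sum(inputs)
                if high >= self.m:
                    self.out = 1              # enough DATA inputs: assert
                elif high == 0:
                    self.out = 0              # complete NULL wavefront: deassert
                return self.out               # otherwise hold previous value (hysteresis)

        # A TH22 gate (both inputs required) behaves like a C-element:
        th22 = THGate(threshold=2, n_inputs=2)
        print(th22.evaluate([1, 0]))   # 0 -> holds NULL until both rails arrive
        print(th22.evaluate([1, 1]))   # 1 -> DATA wavefront complete
        print(th22.evaluate([1, 0]))   # 1 -> holds DATA until a full NULL wavefront
        print(th22.evaluate([0, 0]))   # 0 -> NULL restored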

    Energy Efficient and Error Resilient Neuromorphic Computing in VLSI

    Realization of the conventional Von Neumann architecture faces increasing challenges due to growing process variations, device reliability and power consumption. As an appealing architectural solution, brain-inspired neuromorphic computing has drawn a great deal of research interest due to its potential improved scalability and power efficiency, and better suitability in processing complex tasks. Moreover, the inherent error resilience of neuromorphic computing allows remarkable power and energy savings by exploiting approximate computing. This dissertation focuses on a scalable and energy-efficient neurocomputing architecture which leverages emerging memristor nanodevices and a novel approximate arithmetic for cognitive computing. First, a brain-inspired digital neuromorphic processor (DNP) architecture with a memristive synaptic crossbar is presented for large-scale spiking neural networks. We leverage memristor nanodevices to build an N × N crossbar array to store not only multibit synaptic weight values but also the network configuration data with significantly reduced area cost. Additionally, the crossbar array is accessible both column- and row-wise to significantly expedite the synaptic weight update process for on-chip learning. The proposed digital pulse width modulator (PWM) readily creates a binary pulse with various durations to read and write the multilevel memristors at low cost. Our design integrates N digital leaky integrate-and-fire (LIF) silicon neurons to mimic their biological counterparts and the respective on-chip learning circuits for implementing spike-timing-dependent plasticity (STDP) learning rules. The proposed column-based analog-to-digital conversion (ADC) scheme accumulates the pre-synaptic weights of a neuron efficiently and reduces silicon area by using only one shared arithmetic unit for processing the LIF operations of all N neurons. With 256 silicon neurons, the learning circuits and 64K synapses, the power dissipation and area of our design are evaluated as 6.45 mW and 1.86 mm2, respectively, in a 90 nm CMOS technology. Furthermore, arithmetic computations contribute significantly to the overall processing time and power of the proposed architecture. In particular, addition and comparison operations represent 88.5% and 42.9% of the processing time and power for digital LIF computation, respectively. Hence, by exploiting the built-in resilience of the presented neuromorphic architecture, we propose novel approximate adder and comparator designs to significantly reduce energy consumption with a very low error rate. The significantly improved error rate and critical path delay stem from a novel carry prediction technique that leverages the information from less significant input bits in a parallel manner. An error magnitude reduction scheme is proposed to further reduce the amount of error once detected, at low cost, in the proposed adder design. Implemented in a commercial 90 nm CMOS process, the proposed adder is shown to be up to 2.4× faster and 43% more energy efficient than traditional adders while having an error rate of only 0.18%. Additionally, the proposed comparator achieves an error rate of less than 0.1% and an energy reduction of up to 4.9× compared to conventional ones. The proposed arithmetic has been adopted in a VLSI-based neuromorphic character recognition chip using unsupervised learning. The approximation errors of the proposed arithmetic units have been shown to have negligible impact on the training process. Moreover, an energy saving of up to 66.5% over traditional arithmetic units is achieved for the neuromorphic chip with scaled supply levels.
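
    As a generic illustration of carry prediction from less significant bits, the sketch below implements a block-wise approximate adder whose carry into each block is speculated from a short window of lower-order bits, then measures its mismatch rate against exact addition. The block and window sizes are arbitrary assumptions; this is not the specific adder, error-magnitude reduction scheme, or comparator proposed in the dissertation.

        import random

        # Minimal sketch of carry-speculative approximate addition: the carry into
        # each block is predicted from only the WINDOW less-significant bits directly
        # below it, instead of rippling through the full word.
        WIDTH, BLOCK, WINDOW = 32, 8, 4
        MASK = (1 << WIDTH) - 1

        def approx_add(a, b):
            result = 0
            for lo in range(0, WIDTH, BLOCK):
                # Predict the carry into this block from a short window of lower bits.
                if lo == 0:
                    carry_in = 0
                else:
                    w_lo = lo - WINDOW
                    wa = (a >> w_lo) & ((1 << WINDOW) - 1)
                    wb = (b >> w_lo) & ((1 << WINDOW) - 1)
                    carry_in = (wa + wb) >> WINDOW            # speculative carry
                ba = (a >> lo) & ((1 << BLOCK) - 1)
                bb = (b >> lo) & ((1 << BLOCK) - 1)
                result |= ((ba + bb + carry_in) & ((1 << BLOCK) - 1)) << lo
            return result & MASK

        # Measure how often the speculation differs from an exact 32-bit add.
        random.seed(0)
        errors = sum(approx_add(a, b) != ((a + b) & MASK)
                     for a, b in ((random.getrandbits(WIDTH), random.getrandbits(WIDTH))
                                  for _ in range(10000)))
        print(f"mismatch rate: {errors / 10000:.4%}")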