1,842 research outputs found

    An ART1 microchip and its use in multi-ART1 systems

    Get PDF
    Recently, a real-time clustering microchip neural engine based on the ART1 architecture has been reported. Such chip is able to cluster 100-b patterns into up to 18 categories at a speed of 1.8 ÎŒs per pattern. However, that chip rendered an extremely high silicon area consumption of 1 cm2, and consequently an extremely low yield of 6%. Redundant circuit techniques can be introduced to improve yield performance at the cost of further increasing chip size. In this paper we present an improved ART1 chip prototype based on a different approach to implement the most area consuming circuit elements of the first prototype: an array of several thousand current sources which have to match within a precision of around 1%. Such achievement was possible after a careful transistor mismatch characterization of the fabrication process (ES2-1.0 ÎŒm CMOS). A new prototype chip has been fabricated which can cluster 50-b input patterns into up to ten categories. The chip has 15 times less area, shows a yield performance of 98%, and presents the same precision and speed than the previous prototype. Due to its higher robustness multichip systems are easily assembled. As a demonstration we show results of a two-chip ART1 system, and of an ARTMAP system made of two ART1 chips and an extra interfacing chip

    Compact low-power calibration mini-DACs for neural arrays with programmable weights

    Get PDF
    This paper considers the viability of compact low-resolution low-power mini digital-to-analog converters (mini-DACs) for use in large arrays of neural type cells, where programmable weights are required. Transistors are biased in weak inversion in order to yield small currents and low power consumptions, a necessity when building large size arrays. One important drawback of weak inversion operation is poor matching between transistors. The resulting effective precision of a fabricated array of 50 DACs turned out to be 47% (1.1 bits), due to transistor mismatch. However, it is possible to combine them two by two in order to build calibrated DACs, thus compensating for inter-DAC mismatch. It is shown experimentally that the precision can be improved easily by a factor of 10 (4.8% or 4.4 bits), which makes these DACs viable for low-resolution applications such as massive arrays of neural processing circuits. A design methodology is provided, and illustrated through examples, to obtain calibrated mini-DACs of a given target precision. As an example application, we show simulation results of using this technique to calibrate an array of digitally controlled integrate-and-fire neurons.Gobierno de España TIC1999-0446-C02-02, TIC2000-0406-P4-05, FIT-07000/2002/921, TIC2002-10878-EEuropean Union IST- 2001-3412

    MorphIC: A 65-nm 738k-Synapse/mm2^2 Quad-Core Binary-Weight Digital Neuromorphic Processor with Stochastic Spike-Driven Online Learning

    Full text link
    Recent trends in the field of neural network accelerators investigate weight quantization as a means to increase the resource- and power-efficiency of hardware devices. As full on-chip weight storage is necessary to avoid the high energy cost of off-chip memory accesses, memory reduction requirements for weight storage pushed toward the use of binary weights, which were demonstrated to have a limited accuracy reduction on many applications when quantization-aware training techniques are used. In parallel, spiking neural network (SNN) architectures are explored to further reduce power when processing sparse event-based data streams, while on-chip spike-based online learning appears as a key feature for applications constrained in power and resources during the training phase. However, designing power- and area-efficient spiking neural networks still requires the development of specific techniques in order to leverage on-chip online learning on binary weights without compromising the synapse density. In this work, we demonstrate MorphIC, a quad-core binary-weight digital neuromorphic processor embedding a stochastic version of the spike-driven synaptic plasticity (S-SDSP) learning rule and a hierarchical routing fabric for large-scale chip interconnection. The MorphIC SNN processor embeds a total of 2k leaky integrate-and-fire (LIF) neurons and more than two million plastic synapses for an active silicon area of 2.86mm2^2 in 65nm CMOS, achieving a high density of 738k synapses/mm2^2. MorphIC demonstrates an order-of-magnitude improvement in the area-accuracy tradeoff on the MNIST classification task compared to previously-proposed SNNs, while having no penalty in the energy-accuracy tradeoff.Comment: This document is the paper as accepted for publication in the IEEE Transactions on Biomedical Circuits and Systems journal (2019), the fully-edited paper is available at https://ieeexplore.ieee.org/document/876400

    An Analog VLSI Deep Machine Learning Implementation

    Get PDF
    Machine learning systems provide automated data processing and see a wide range of applications. Direct processing of raw high-dimensional data such as images and video by machine learning systems is impractical both due to prohibitive power consumption and the “curse of dimensionality,” which makes learning tasks exponentially more difficult as dimension increases. Deep machine learning (DML) mimics the hierarchical presentation of information in the human brain to achieve robust automated feature extraction, reducing the dimension of such data. However, the computational complexity of DML systems limits large-scale implementations in standard digital computers. Custom analog signal processing (ASP) can yield much higher energy efficiency than digital signal processing (DSP), presenting means of overcoming these limitations. The purpose of this work is to develop an analog implementation of DML system. First, an analog memory is proposed as an essential component of the learning systems. It uses the charge trapped on the floating gate to store analog value in a non-volatile way. The memory is compatible with standard digital CMOS process and allows random-accessible bi-directional updates without the need for on-chip charge pump or high voltage switch. Second, architecture and circuits are developed to realize an online k-means clustering algorithm in analog signal processing. It achieves automatic recognition of underlying data pattern and online extraction of data statistical parameters. This unsupervised learning system constitutes the computation node in the deep machine learning hierarchy. Third, a 3-layer, 7-node analog deep machine learning engine is designed featuring online unsupervised trainability and non-volatile floating-gate analog storage. It utilizes massively parallel reconfigurable current-mode analog architecture to realize efficient computation. And algorithm-level feedback is leveraged to provide robustness to circuit imperfections in analog signal processing. At a processing speed of 8300 input vectors per second, it achieves 1×1012 operation per second per Watt of peak energy efficiency. In addition, an ultra-low-power tunable bump circuit is presented to provide similarity measures in analog signal processing. It incorporates a novel wide-input-range tunable pseudo-differential transconductor. The circuit demonstrates tunability of bump center, width and height with a power consumption significantly lower than previous works

    A spatial contrast retina with on-chip calibration for neuromorphic spike-based AER vision systems

    Get PDF
    We present a 32 32 pixels contrast retina microchip that provides its output as an address event representation (AER) stream. Spatial contrast is computed as the ratio between pixel photocurrent and a local average between neighboring pixels obtained with a diffuser network. This current-based computation produces an important amount of mismatch between neighboring pixels, because the currents can be as low as a few pico-amperes. Consequently, a compact calibration circuitry has been included to trimm each pixel. Measurements show a reduction in mismatch standard deviation from 57% to 6.6% (indoor light). The paper describes the design of the pixel with its spatial contrast computation and calibration sections. About one third of pixel area is used for a 5-bit calibration circuit. Area of pixel is 58 m 56 m, while its current consumption is about 20 nA at 1-kHz event rate. Extensive experimental results are provided for a prototype fabricated in a standard 0.35- m CMOS process.Gobierno de España TIC2003-08164-C03-01, TEC2006-11730-C03-01European Union IST-2001-3412

    A multi-view approach to cDNA micro-array analysis

    Get PDF
    The official published version can be obtained from the link below.Microarray has emerged as a powerful technology that enables biologists to study thousands of genes simultaneously, therefore, to obtain a better understanding of the gene interaction and regulation mechanisms. This paper is concerned with improving the processes involved in the analysis of microarray image data. The main focus is to clarify an image's feature space in an unsupervised manner. In this paper, the Image Transformation Engine (ITE), combined with different filters, is investigated. The proposed methods are applied to a set of real-world cDNA images. The MatCNN toolbox is used during the segmentation process. Quantitative comparisons between different filters are carried out. It is shown that the CLD filter is the best one to be applied with the ITE.This work was supported in part by the Engineering and Physical Sciences Research Council (EPSRC) of the UK under Grant GR/S27658/01, the National Science Foundation of China under Innovative Grant 70621001, Chinese Academy of Sciences under Innovative Group Overseas Partnership Grant, the BHP Billiton Cooperation of Australia Grant, the International Science and Technology Cooperation Project of China under Grant 2009DFA32050 and the Alexander von Humboldt Foundation of Germany
    • 

    corecore