45 research outputs found

Intégration CMOS analogique de réseaux de neurones à cliques

Author: Larras Benoit
Publication venue: HAL CCSD
Publication date: 03/12/2015
Field of study

Artificial neural networks solve problems that classical processors cannot solve without a huge amount of hardware resources, such as multiple-signal analysis and classification. They are increasingly integrated on-chip, either to increase the computational abilities of processors or to process data in embedded systems, where circuit area and energy consumption are critical parameters. However, the number of connections between neurons is very high, and weighted connections and complex activation functions make circuit integration difficult. These limitations apply to most artificial neural network models and thus hinder the integration of networks with a high number of neurons (hundreds or more). Clique-based neural networks are a model of artificial neural networks that reduces network density in terms of connections between neurons, while offering a greater information storage capacity than a standard model such as Hopfield neural networks. This model is therefore suited to implementing a high number of neurons on chip, leading to low-complexity and low-energy circuits. In this document, we introduce a mixed-signal circuit implementing clique-based neural networks. We also show several generic network architectures implementing a network of any number of neurons. We can therefore implement clique-based neural networks of up to thousands of neurons consuming little energy. To validate the proposed implementation, we fabricated and tested a 30-neuron clique-based neural network prototype integrated on chip in a 65-nm Si CMOS process with a 1-V supply. The circuit shows decoding performance similar to that of the theoretical model and executes the message recovery process in 58 ns. Moreover, the entire network occupies a silicon area of 16,470 µm² and consumes 145 µW, yielding a measured energy consumption per neuron of 423 fJ maximum. These results show that the fabricated circuit is ten times more efficient in terms of occupied silicon area and latency than an equivalent digital circuit.

Thèses en Ligne

HAL-Université de Bretagne Occidentale

HAL Descartes

Hal-Diderot

Review on data-centric brain-inspired computing paradigms exploiting emerging memory devices

Author: Du Nan
Kvatinsky Shahar
Schmidt Heidemarie
Wang Wei
Publication venue
Publication date: 07/10/2022
Field of study

Biologically inspired neuromorphic computing paradigms are computational platforms that imitate synaptic and neuronal activities in the human brain to process big data flows in an efficient and cognitive manner. In the past decades, neuromorphic computing has been widely investigated in various application fields such as language translation, image recognition, modeling of phase, and speech recognition, especially in neural networks (NNs) that utilize emerging nanotechnologies. Owing to their inherent miniaturization and low power cost, these technologies can alleviate the technical barriers that neuromorphic computing faces when it relies on traditional silicon technology in practical applications. In this work, we review recent advances in the development of brain-inspired computing (BIC) systems from the perspective of a system designer, from the device technology level and circuit level up to the architecture and system levels. In particular, we sort out the NN architecture determined by the data structures centered on big data flows in application scenarios. Finally, the interactions of the system level with the architecture level and the circuit/device level are discussed. Consequently, this review can inform future developments and opportunities in BIC system design.

Digitale Bibliothek Thüringen

Fiabilisation de convertisseurs analogique-numérique à modulation Sigma-Delta

Author: Cai Hao
Publication venue: HAL CCSD
Publication date: 09/09/2013
Field of study

This thesis concentrates on reliability-aware methodology development, simulation-based reliability analysis, failure prediction, and reliability improvement for 65 nm CMOS analog and mixed-signal (AMS) ICs. Sigma-Delta modulators serve as the object of the reliability study at system level. A hierarchical statistical approach to reliability is proposed to analyze the performance of Sigma-Delta modulators under ageing effects and process variations, and statistical methods are combined into this analysis flow. The dominant ageing mechanisms, HCI and NBTI, as well as process variation, were studied and quantitatively evaluated at the circuit level and at the system level. These methods were applied to Sigma-Delta modulators in order to determine the reliability of this widely used type of component.

Thèses en Ligne

thèses en ligne de ParisTech

Fiabilisation de Convertisseurs Analogique-Numérique à Modulation Sigma-Delta

Author: Cai Hao
Publication venue: HAL CCSD
Publication date: 09/09/2013
Field of study

Due to the continuous scaling down of CMOS technology, system-on-chip (SoC) reliability becomes important at sub-90 nm CMOS nodes. Integrated circuits and systems applied to aerospace, avionics, vehicle transport and biomedicine are highly sensitive to reliability problems such as ageing mechanisms and parametric process variations. Novel SoCs with new materials and architectures of high complexity further aggravate reliability as a critical aspect of process integration. For instance, random and systematic defects as well as parametric process variations have a large influence on the quality and yield of the manufactured ICs, right after production. During IC usage time, time-dependent ageing mechanisms such as negative bias temperature instability (NBTI) and hot carrier injection (HCI) can significantly degrade IC performance. IC reliability is defined as the ability of an integrated circuit or system to maintain its parameters over a given period under defined conditions. The ITRS 2011 reports consider reliability a critical aspect of process integration. Consequently, innovative reliability-aware methodologies are needed to ensure SoC functionality and reliability in nanometer-scale CMOS technologies. This will allow us to develop methodologies that are independent of the design and of the CMOS technology, yet specialized in reliability.

Thèses en Ligne

Intelligent Sensor Networks

Author
Publication venue: 'Informa UK Limited'
Publication date
Field of study

In the last decade, wireless or wired sensor networks have attracted much attention. However, most designs target general sensor network issues including protocol stack (routing, MAC, etc.) and security issues. This book focuses on the close integration of sensing, networking, and smart signal processing via machine learning. Based on their world-class research, the authors present the fundamentals of intelligent sensor networks. They cover sensing and sampling, distributed signal processing, and intelligent signal learning. In addition, they present cutting-edge research results from leading experts

OAPEN Library

Understanding Quantum Technologies 2022

Author: Ezratty Olivier
Publication venue
Publication date: 27/10/2022
Field of study

Understanding Quantum Technologies 2022 is a creative-commons ebook that provides a unique 360-degree overview of quantum technologies, from science and technology to geopolitical and societal issues. It covers quantum physics history, quantum physics 101, gate-based quantum computing, quantum computing engineering (including quantum error corrections and quantum computing energetics), quantum computing hardware (all qubit types, including quantum annealing and quantum simulation paradigms, history, science, research, implementation and vendors), quantum enabling technologies (cryogenics, control electronics, photonics, components fabs, raw materials), quantum computing algorithms, software development tools and use cases, unconventional computing (potential alternatives to quantum and classical computing), quantum telecommunications and cryptography, quantum sensing, quantum technologies around the world, quantum technologies societal impact and even quantum fake sciences. The main audience is computer science engineers, developers and IT specialists as well as quantum scientists and students who want to acquire a global view of how quantum technologies work, and particularly quantum computing. This version is an extensive update to the 2021 edition published in October 2021. Comment: 1132 pages, 920 figures, Letter forma

arXiv.org e-Print Archive

Dynamical Systems in Spiking Neuromorphic Hardware

Author: Voelker Aaron Russell
Publication venue: 'University of Waterloo'
Publication date: 25/04/2019
Field of study

Dynamical systems are universal computers. They can perceive stimuli, remember, learn from feedback, plan sequences of actions, and coordinate complex behavioural responses. The Neural Engineering Framework (NEF) provides a general recipe to formulate models of such systems as coupled sets of nonlinear differential equations and compile them onto recurrently connected spiking neural networks – akin to a programming language for spiking models of computation. The Nengo software ecosystem supports the NEF and compiles such models onto neuromorphic hardware. In this thesis, we analyze the theory driving the success of the NEF, and expose several core principles underpinning its correctness, scalability, completeness, robustness, and extensibility. We also derive novel theoretical extensions to the framework that enable it to far more effectively leverage a wide variety of dynamics in digital hardware, and to exploit the device-level physics in analog hardware. At the same time, we propose a novel set of spiking algorithms that recruit an optimal nonlinear encoding of time, which we call the Delay Network (DN). Backpropagation across stacked layers of DNs dramatically outperforms stacked Long Short-Term Memory (LSTM) networks—a state-of-the-art deep recurrent architecture—in accuracy and training time, on a continuous-time memory task, and a chaotic time-series prediction benchmark. The basic component of this network is shown to function on state-of-the-art spiking neuromorphic hardware including Braindrop and Loihi. This implementation approaches the energy-efficiency of the human brain in the former case, and the precision of conventional computation in the latter case

University of Waterloo's Institutional Repository

Exploiting Natural On-chip Redundancy for Energy Efficient Memory and Computing

Author: Alastruey Benedé Jesús
Ferrerón Labari Alexandra
Suárez Gracia Darío
Publication venue: Universidad de Zaragoza, Prensas de la Universidad
Publication date: 01/01/2016
Field of study

Power density is currently the primary design constraint across most computing segments and the main performance-limiting factor. For years, industry has kept power density constant while increasing frequency and lowering transistor supply (Vdd) and threshold (Vth) voltages. However, Vth scaling has stopped because leakage current is exponentially related to it. Transistor count and integration density keep doubling every process generation (Moore’s Law), but the power budget caps the amount of hardware that can be active at the same time, leading to dark silicon. With each new generation, there are more resources available, but we cannot fully exploit their performance potential. In recent years, different research trends have explored how to cope with dark silicon and unlock the energy efficiency of the chips, including Near-Threshold voltage Computing (NTC) and approximate computing. NTC aggressively lowers Vdd to values near Vth. This allows a substantial reduction in power, as dynamic power scales quadratically with supply voltage. The resultant power reduction could be used to activate more chip resources and potentially achieve performance improvements. Unfortunately, Vdd scaling is limited by the tight functionality margins of on-chip SRAM transistors. When Vdd is scaled down to near-threshold values, manufacture-induced parameter variations affect the functionality of SRAM cells, which eventually become unreliable. A large number of emerging applications, on the other hand, feature an intrinsic error-resilience property, tolerating a certain amount of noise. Approximate computing takes advantage of this observation and exploits the gap between the level of accuracy required by the application and the level of accuracy given by the computation, provided that reducing the accuracy translates into an energy gain. However, deciding which instructions and data and which techniques are best suited for approximation still poses a major challenge. This dissertation contributes in these two directions. First, it proposes a new approach to mitigate the impact of SRAM failures due to parameter variation for effective operation at ultra-low voltages. We identify two levels of natural on-chip redundancy: cache level and content level. The first arises because of the replication of blocks in multi-level cache hierarchies. We exploit this redundancy with a cache management policy that allocates blocks to entries taking into account the nature of the cache entry and the use pattern of the block. This policy obtains performance improvements between 2% and 34% with respect to block disabling, a technique with similar complexity, incurring no additional storage overhead. The latter (content-level redundancy) arises because of the redundancy of data in real-world applications. We exploit this redundancy by compressing cache blocks to fit them in partially functional cache entries. At the cost of a slight overhead increase, we can obtain performance within 2% of that obtained when the cache is built with fault-free cells, even if more than 90% of the cache entries have at least one faulty cell. Then, we analyze how the intrinsic noise tolerance of emerging applications can be exploited to design an approximate Instruction Set Architecture (ISA). Exploiting the ISA redundancy, we explore a set of techniques to approximate the execution of instructions across a set of emerging applications, pointing out the potential of reducing the complexity of the ISA, and the trade-offs of the approach. In a proof-of-concept implementation, the ISA is shrunk in two dimensions: Breadth (i.e., simplifying instructions) and Depth (i.e., dropping instructions). This proof-of-concept shows that energy can be reduced by 20.6% on average at around 14.9% accuracy loss.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Repositorio Universidad de Zaragoza

DOC 2014-09 Proposal for MS in Computer Engineering (MSCPE)

Author: University of Dayton. School of Engineering
Publication venue: eCommons
Publication date: 01/01/2014
Field of study

Legislative authorit

University of Dayton