25 research outputs found

    Embedding Logic and Non-volatile Devices in CMOS Digital Circuits for Improving Energy Efficiency

    Get PDF
    abstract: Static CMOS logic has remained the dominant design style of digital systems for more than four decades due to its robustness and near zero standby current. Static CMOS logic circuits consist of a network of combinational logic cells and clocked sequential elements, such as latches and flip-flops that are used for sequencing computations over time. The majority of the digital design techniques to reduce power, area, and leakage over the past four decades have focused almost entirely on optimizing the combinational logic. This work explores alternate architectures for the flip-flops for improving the overall circuit performance, power and area. It consists of three main sections. First, is the design of a multi-input configurable flip-flop structure with embedded logic. A conventional D-type flip-flop may be viewed as realizing an identity function, in which the output is simply the value of the input sampled at the clock edge. In contrast, the proposed multi-input flip-flop, named PNAND, can be configured to realize one of a family of Boolean functions called threshold functions. In essence, the PNAND is a circuit implementation of the well-known binary perceptron. Unlike other reconfigurable circuits, a PNAND can be configured by simply changing the assignment of signals to its inputs. Using a standard cell library of such gates, a technology mapping algorithm can be applied to transform a given netlist into one with an optimal mixture of conventional logic gates and threshold gates. This approach was used to fabricate a 32-bit Wallace Tree multiplier and a 32-bit booth multiplier in 65nm LP technology. Simulation and chip measurements show more than 30% improvement in dynamic power and more than 20% reduction in core area. The functional yield of the PNAND reduces with geometry and voltage scaling. The second part of this research investigates the use of two mechanisms to improve the robustness of the PNAND circuit architecture. One is the use of forward and reverse body biases to change the device threshold and the other is the use of RRAM devices for low voltage operation. The third part of this research focused on the design of flip-flops with non-volatile storage. Spin-transfer torque magnetic tunnel junctions (STT-MTJ) are integrated with both conventional D-flipflop and the PNAND circuits to implement non-volatile logic (NVL). These non-volatile storage enhanced flip-flops are able to save the state of system locally when a power interruption occurs. However, manufacturing variations in the STT-MTJs and in the CMOS transistors significantly reduce the yield, leading to an overly pessimistic design and consequently, higher energy consumption. A detailed analysis of the design trade-offs in the driver circuitry for performing backup and restore, and a novel method to design the energy optimal driver for a given yield is presented. Efficient designs of two nonvolatile flip-flop (NVFF) circuits are presented, in which the backup time is determined on a per-chip basis, resulting in minimizing the energy wastage and satisfying the yield constraint. To achieve a yield of 98%, the conventional approach would have to expend nearly 5X more energy than the minimum required, whereas the proposed tunable approach expends only 26% more energy than the minimum. A non-volatile threshold gate architecture NV-TLFF are designed with the same backup and restore circuitry in 65nm technology. The embedded logic in NV-TLFF compensates performance overhead of NVL. This leads to the possibility of zero-overhead non-volatile datapath circuits. An 8-bit multiply-and- accumulate (MAC) unit is designed to demonstrate the performance benefits of the proposed architecture. Based on the results of HSPICE simulations, the MAC circuit with the proposed NV-TLFF cells is shown to consume at least 20% less power and area as compared to the circuit designed with conventional DFFs, without sacrificing any performance.Dissertation/ThesisDoctoral Dissertation Electrical Engineering 201

    Nouvelles Architectures Hybrides (Logique / Mémoires Non-Volatiles et technologies associées.)

    Get PDF
    Les nouvelles approches de technologies mémoires permettront une intégration dite back-end, où les cellules élémentaires de stockage seront fabriquées lors des dernières étapes de réalisation à grande échelle du circuit. Ces approches innovantes sont souvent basées sur l'utilisation de matériaux actifs présentant deux états de résistance distincts. Le passage d'un état à l'autre est contrôlé en courant ou en tension donnant lieu à une caractéristique I-V hystérétique. Nos mémoires résistives sont composées d'argent en métal électrochimiquement actif et de sulfure amorphe agissant comme électrolyte. Leur fonctionnement repose sur la formation réversible et la dissolution d'un filament conducteur. Le potentiel d'application de ces nouveaux dispositifs n'est pas limité aux mémoires ultra-haute densité mais aussi aux circuits embarqués. En empilant ces mémoires dans la troisième dimension au niveau des interconnections des circuits logiques CMOS, de nouvelles architectures hybrides et innovantes deviennent possibles. Il serait alors envisageable d'exploiter un fonctionnement à basse énergie, à haute vitesse d'écriture/lecture et de haute performance telles que l'endurance et la rétention. Dans cette thèse, en se concentrant sur les aspects de la technologie de mémoire en vue de développer de nouvelles architectures, l'introduction d'une fonctionnalité non-volatile au niveau logique est démontrée par trois circuits hybrides: commutateurs de routage non volatiles dans un Field Programmable Gate Arrays, un 6T-SRAM non volatile, et les neurones stochastiques pour un réseau neuronal. Pour améliorer les solutions existantes, les limitations de la performances des dispositifs mémoires sont identifiés et résolus avec des nouveaux empilements ou en fournissant des défauts de circuits tolérants.Novel approaches in the field of memory technology should enable backend integration, where individual storage nodes will be fabricated during the last fabrication steps of the VLSI circuit. In this case, memory operation is often based upon the use of active materials with resistive switching properties. A topology of resistive memory consists of silver as electrochemically active metal and amorphous sulfide acting as electrolyte and relies on the reversible formation and dissolution of a conductive filament. The application potential of these new memories is not limited to stand-alone (ultra-high density), but is also suitable for embedded applications. By stacking these memories in the third dimension at the interconnection level of CMOS logic, new ultra-scalable hybrid architectures becomes possible which exploit low energy operation, fast write/read access and high performance with respect to endurance and retention. In this thesis, focusing on memory technology aspects in view of developing new architectures, the introduction of non-volatile functionality at the logic level is demonstrated through three hybrid (CMOS logic ReRAM devices) circuits: nonvolatile routing switches in a Field Programmable Gate Array, nonvolatile 6T-SRAMs, and stochastic neurons of an hardware neural network. To be competitive or even improve existing solutions, limitations on the memory devices performances are identified and solved by stack engineering of CBRAM devices or providing faults tolerant circuits.SAVOIE-SCD - Bib.électronique (730659901) / SudocGRENOBLE1/INP-Bib.électronique (384210012) / SudocGRENOBLE2/3-Bib.électronique (384219901) / SudocSudocFranceF

    Flash Memory Devices

    Get PDF
    Flash memory devices have represented a breakthrough in storage since their inception in the mid-1980s, and innovation is still ongoing. The peculiarity of such technology is an inherent flexibility in terms of performance and integration density according to the architecture devised for integration. The NOR Flash technology is still the workhorse of many code storage applications in the embedded world, ranging from microcontrollers for automotive environment to IoT smart devices. Their usage is also forecasted to be fundamental in emerging AI edge scenario. On the contrary, when massive data storage is required, NAND Flash memories are necessary to have in a system. You can find NAND Flash in USB sticks, cards, but most of all in Solid-State Drives (SSDs). Since SSDs are extremely demanding in terms of storage capacity, they fueled a new wave of innovation, namely the 3D architecture. Today “3D” means that multiple layers of memory cells are manufactured within the same piece of silicon, easily reaching a terabit capacity. So far, Flash architectures have always been based on "floating gate," where the information is stored by injecting electrons in a piece of polysilicon surrounded by oxide. On the contrary, emerging concepts are based on "charge trap" cells. In summary, flash memory devices represent the largest landscape of storage devices, and we expect more advancements in the coming years. This will require a lot of innovation in process technology, materials, circuit design, flash management algorithms, Error Correction Code and, finally, system co-design for new applications such as AI and security enforcement

    Contribution à la conception d'architecture de calcul auto-adaptative intégrant des nanocomposants neuromorphiques et applications potentielles

    Get PDF
    Dans cette thèse, nous étudions les applications potentielles des nano-dispositifs mémoires émergents dans les architectures de calcul. Nous montrons que des architectures neuro-inspirées pourraient apporter l'efficacité et l'adaptabilité nécessaires à des applications de traitement et de classification complexes pour la perception visuelle et sonore. Cela, à un cout moindre en termes de consommation énergétique et de surface silicium que les architectures de type Von Neumann, grâce à une utilisation synaptique de ces nano-dispositifs. Ces travaux se focalisent sur les dispositifs dit memristifs , récemment (ré)-introduits avec la découverte du memristor en 2008 et leur utilisation comme synapse dans des réseaux de neurones impulsionnels. Cela concerne la plupart des technologies mémoire émergentes : mémoire à changement de phase Phase-Change Memory (PCM), Conductive-Bridging RAM (CBRAM), mémoire résistive Resistive RAM (RRAM)... Ces dispositifs sont bien adaptés pour l'implémentation d'algorithmes d'apprentissage non supervisés issus des neurosciences, comme Spike-Timing-Dependent Plasticity (STDP), ne nécessitant que peu de circuit de contrôle. L'intégration de dispositifs memristifs dans des matrices, ou crossbar , pourrait en outre permettre d'atteindre l'énorme densité d'intégration nécessaire pour ce type d'implémentation (plusieurs milliers de synapses par neurone), qui reste hors de portée d'une technologie purement en Complementary Metal Oxide Semiconductor (CMOS). C'est l'une des raisons majeures pour lesquelles les réseaux de neurones basés sur la technologie CMOS n'ont pas eu le succès escompté dans les années 1990. A cela s'ajoute la relative complexité et inefficacité de l'algorithme d'apprentissage de rétro-propagation du gradient, et ce malgré tous les aspects prometteurs des architectures neuro-inspirées, tels que l'adaptabilité et la tolérance aux fautes. Dans ces travaux, nous proposons des modèles synaptiques de dispositifs memristifs et des méthodologies de simulation pour des architectures les exploitant. Des architectures neuro-inspirées de nouvelle génération sont introduites et simulées pour le traitement de données naturelles. Celles-ci tirent profit des caractéristiques synaptiques des nano-dispositifs memristifs, combinées avec les dernières avancées dans les neurosciences. Nous proposons enfin des implémentations matérielles adaptées pour plusieurs types de dispositifs. Nous évaluons leur potentiel en termes d'intégration, d'efficacité énergétique et également leur tolérance à la variabilité et aux défauts inhérents à l'échelle nano-métrique de ces dispositifs. Ce dernier point est d'une importance capitale, puisqu'il constitue aujourd'hui encore la principale difficulté pour l'intégration de ces technologies émergentes dans des mémoires numériques.In this thesis, we study the potential applications of emerging memory nano-devices in computing architecture. More precisely, we show that neuro-inspired architectural paradigms could provide the efficiency and adaptability required in some complex image/audio processing and classification applications. This, at a much lower cost in terms of power consumption and silicon area than current Von Neumann-derived architectures, thanks to a synaptic-like usage of these memory nano-devices. This work is focusing on memristive nano-devices, recently (re-)introduced by the discovery of the memristor in 2008 and their use as synapses in spiking neural network. In fact, this includes most of the emerging memory technologies: Phase-Change Memory (PCM), Conductive-Bridging RAM (CBRAM), Resistive RAM (RRAM)... These devices are particularly suitable for the implementation of natural unsupervised learning algorithms like Spike-Timing-Dependent Plasticity (STDP), requiring very little control circuitry.The integration of memristive devices in crossbar array could provide the huge density required by this type of architecture (several thousand synapses per neuron), which is impossible to match with a CMOS-only implementation. This can be seen as one of the main factors that hindered the rise of CMOS-based neural network computing architectures in the nineties, among the relative complexity and inefficiency of the back-propagation learning algorithm, despite all the promising aspects of such neuro-inspired architectures, like adaptability and fault-tolerance. In this work, we propose synaptic models for memristive devices and simulation methodologies for architectural design exploiting them. Novel neuro-inspired architectures are introduced and simulated for natural data processing. They exploit the synaptic characteristics of memristives nano-devices, along with the latest progresses in neurosciences. Finally, we propose hardware implementations for several device types. We assess their scalability and power efficiency potential, and their robustness to variability and faults, which are unavoidable at the nanometric scale of these devices. This last point is of prime importance, as it constitutes today the main difficulty for the integration of these emerging technologies in digital memories.PARIS11-SCD-Bib. électronique (914719901) / SudocSudocFranceF

    Topical Workshop on Electronics for Particle Physics

    Get PDF

    Technology 2003: Conference Proceedings from the Fourth National Technology Transfer Conference and Exposition, Volume 1

    Get PDF
    Proceedings from symposia of the Technology 2003 Conference and Exposition, December 7-9, I993, Anaheim, CA. Volume 1 features the Plenary Session and the Plenary Workshop, plus papers presented in Advanced Manufacturing, Biotechnology/Medical Technology, Environmental Technology, Materials Science, and Power and Energy

    Analysis of Wireless Body-Centric Medical Sensors for Remote Healthcare

    Get PDF
    Aquesta tesi aborda el problema de trobar solucions confortables, de baixa potència i sense fils per aplicacions mèdiques. La tesi tracta els avantatges i les limitacions de tres tecnologies de comunicació diferents per la mesura de paràmetres del cos i mètodes per redissenyar sensors per avaluacions òptimes centrades en el cos. La tecnologia RFID es considera una de les solucions més influents per superar el problema del consum d'energia limitat, a causa de la presència de molts sensors connectats. També s'ha estudiat la tecnologia Bluetooth de baixa energia per resoldre els problemes de seguretat i la distància de lectura que, en general, representen el coll d'ampolla de RFID pels sensors de cos. Els dispositius analògics poden reduir dràsticament les necessitats d'energia a causa dels sensors i les comunicacions, considerant pocs elements i un mètode de transmissió simple. S'estudia un mètode de comunicació completament passiu, basat en FSS, que permet una distància de lectura raonable amb capacitats de detecció precises i confiables, que s'ha discutit en aquesta tesi. L'objectiu d'aquesta tesi és investigar múltiples tecnologies sense fils per dispositius portàtils per identificar solucions adequades per aplicacions particulars en el camp mèdic. El primer objectiu és demostrar la facilitat d'ús de les tecnologies econòmiques sense bateria com un indicador útil de paràmetres fisiopatològics mitjançant la investigació de les propietats de les etiquetes RFID. A més a més, s'ha abordat un aspecte més complex respecte a l'ús de petits components passius com sensors sense fils per trastorns del son. Per últim, un altre objectiu de la tesi és el desenvolupament d'un sistema completament autònom que utilitzi tecnologia BLE per obtenir propietats avançades mantenint baix tant el consum com el preuEsta tesis aborda el problema de encontrar soluciones confortables, inalámbricas y de baja potencia para aplicaciones médicas. La tesis discute las ventajas y limitaciones de tres tecnologías de comunicación diferentes para la medición en el cuerpo y los métodos para elegir y remodelar los sensores para evaluaciones óptimas centradas en el cuerpo. La tecnología RFID se considera una de las soluciones más influyentes para superar el consumo de energía limitado debido a la presencia de muchos sensores conectados. Además, la baja energía de Bluetooth se ha estudiado se ha estudiado la tecnologia Bluetooth de baja energia para resolver los problemas de seguridad y la distancia de lectura que, en general, representan el cuello de botella de la RFID para los sensores de cuerpo. Los dispositivos analógicos pueden reducir drásticamente las necesidades de energía debido a los sensores y las comunicaciones, considerando pocos elementos y un método de transmisión simple. Se estudia un método de comunicación completamente pasivo, basado en FSS, que permite una distancia de lectura razonable con capacidades de detección precisas y confiables, que se ha discutido en esta tesis. El objetivo de esta tesis es investigar múltiples tecnologías inalámbricas para dispositivos portátiles para identificar soluciones adecuadas para aplicaciones particulares en campos médicos. El primer objetivo es demostrar la facilidad de uso de las tecnologías económicas sin batería como un indicador útil de dichos parámetros fisiopatológicos mediante la investigación de las propiedades de las etiquetas RFID. Además, se ha abordado un aspecto más complejo con respecto al uso de pequeños componentes pasivos como sensores inalámbricos para enfermedades del sueño. Por último, un resultado de la tesis es desarrollar un sistema completamente autónomo que utilice la tecnología BLE para obtener propiedades avanzadas que mantengan la baja potencia y un precio bajo.This thesis addresses the problem of comfortable, low powered and, wireless solutions for specific body-worn sensing. The thesis discusses advantages and limitations of three different communication technologies for on body measurement and investigate methods to reshape sensors for optimum body-centric assessments. The RFID technology is considered one of the most influential solutions to overcome the limitated power consumption due to the presence of many sensors connected. Further, the Bluetooth low energy has been studied to solve security problems and reading distance that overall represent the bottleneck of the RFID for the body-worn sensors. Analog devices can drastically reduce the energy needs due to the sensors and the communications, considering few elements and a simple transmitting method. An entirely passive communication method, based on FSS is studied, enabling a reasonable reading distance with precise and reliable sensing capabilities, which has been discussed in this thesis. The objective of this thesis is to investigate multiple wireless technologies for wearable devices to identify suitable solutions for particular applications in medical fields. The first objective is to demonstrate the usability of the inexpensive battery-less technologies as a useful indicator of such a physio-pathological parameters by investigating the properties of the RFID tags. Furthermore, a more complex aspect regards the use of small passive components as wireless sensors for sleep diseases has been addressed. Lastly, an outcome of the thesis is to develop an entirely autonomous system using the BLE technology to obtain advanced properties keeping low power and a low price

    High Performance Textiles

    Get PDF
    High-performance or hi-tech textiles represent the keystone of the present and the future for all industrial sectors, which require lightening, flexibility, and the high mechanical resistance as well as thermal stability of the materials. As described within this Special Issue, the applications of these advanced systems are innovative and also highly technological: from water-repellent to stain-resistant fabrics, from being flame-resistant to antibacterial/antifouling, from being insulating to conductive, and from environmental protection systems to smart textiles. High-performance textiles also meet all of the actual requirements of sustainability and environmental protection of modern industry

    Topical Workshop on Electronics for Particle Physics

    Get PDF
    The purpose of the workshop was to present results and original concepts for electronics research and development relevant to particle physics experiments as well as accelerator and beam instrumentation at future facilities; to review the status of electronics for the LHC experiments; to identify and encourage common efforts for the development of electronics; and to promote information exchange and collaboration in the relevant engineering and physics communities

    Three-Dimensional Processing-In-Memory-Architectures: A Holistic Tool For Modeling And Simulation

    Get PDF
    Die gemeinhin als Memory Wall bekannte, sich stetig weitende Leistungslücke zwischen Prozessor- und Speicherarchitekturen erfordert neue Konzepte, um weiterhin eine Skalierung der Rechenleistung zu ermöglichen. Da Speicher als die Beschränkung innerhalb einer Von-Neumann-Architektur identifiziert wurden, widmet sich die Arbeit dieser Problemstellung. Obgleich dreidimensionale Speicher zu einer Linderung der Memory Wall beitragen können, sind diese alleinig für die zukünftige Skalierung ungenügend. Aufgrund höherer Effizienzen stellt die Integration von Rechenkapazität in den Speicher (Processing-In-Memory, PIM) ein vielversprechender Ausweg dar, jedoch existiert ein Mangel an PIM-Simulationsmodellen. Daher wurde ein flexibles Simulationswerkzeug für dreidimensionale Speicherstapel geschaffen, welches zur Modellierung von dreidimensionalen PIM erweitert wurde. Dieses kann Speicherstapel wie etwa Hybrid Memory Cube standardkonform simulieren und bietet zugleich eine hohe Genauigkeit indem auf elementaren Datenpaketen in Kombination mit dem Hardware validierten Simulator BOBSim modelliert wird. Ein eigens entworfener Simulationstaktbaum ermöglicht zugleich eine schnelle Ausführung. Messungen weisen im funktionalen Modus eine 100-fache Beschleunigung auf, wohingegen eine Verdoppelung der Ausführungsgeschwindigkeit mit Taktgenauigkeit erzielt wird. Anhand eines eigens implementierten, binärkompatiblen GPU-Beschleunigers wird die Modellierung einer vollständig dreidimensionalen PIM-Architektur demonstriert. Dabei orientieren sich die maximalen Hardwareressourcen an einem PIM-Beschleuniger aus der Literatur. Evaluiert wird einerseits das GPU-Simulationsmodell eigenständig, andererseits als PIM-Verbund jeweils mit Hilfe einer repräsentativ gewählten, speicherbeschränkten geophysikalischen Bildverarbeitung. Bei alleiniger Betrachtung des GPU-Simulationsmodells weist dieses eine signifikant gesteigerte Simulationsgeschwindigkeit auf, bei gleichzeitiger Abweichung von 6% gegenüber dem Verilator-Modell. Nachfolgend werden innerhalb dieser Arbeit unterschiedliche Konfigurationen des integrierten PIM-Beschleunigers evaluiert. Je nach gewählter Konfiguration kann der genutzte Algorithmus entweder bis zu 140GFLOPS an tatsächlicher Rechenleistung abrufen oder eine maximale Recheneffizienz von synthetisch 30% bzw. real 24,5% erzielen. Letzteres stellt eine Verdopplung des Stands der Technik dar. Eine anknüpfende Diskussion erläutert eingehend die Resultate.The steadily widening performance gap between processor- and memory-architectures - commonly known as the Memory Wall - requires novel concepts to achieve further scaling in processing performance. As memories were identified as the limitation within a Von-Neumann-architecture, this work addresses this constraining issue. Although three-dimensional memories alleviate the effects of the Memory Wall, the sole utilization of such memories would be insufficient. Due to higher efficiencies, the integration of processing capacity into memories (so-called Processing-In-Memory, PIM) depicts a promising alternative. However, a lack of PIM simulation models still remains. As a consequence, a flexible simulation tool for three-dimensional stacked memories was established, which was extended for modeling three-dimensional PIM architectures. This tool can simulate stacked memories such as Hybrid Memory Cube standard-compliant and simultaneously offers high accuracy by modeling on elementary data packets (FLIT) in combination with the hardware validated BOBSim simulator. To this, a specifically designed simulation clock tree enables an rapid simulation execution. A 100x speed up in simulation execution can be measured while utilizing the functional mode, whereas a 2x speed up is achieved during clock-cycle accuracy mode. With the aid of a specifically implemented, binary compatible GPU accelerator and the established tool, the modeling of a holistic three-dimensional PIM architecture is demonstrated within this work. Hardware resources used were constrained by a PIM architecture from literature. A representative, memory-bound, geophysical imaging algorithm was leveraged to evaluate the GPU model as well as the compound PIM simulation model. The sole GPU simulation model depicts a significantly improved simulation performance with a deviation of 6% compared to a Verilator model. Subsequently, various PIM accelerator configurations with the integrated GPU model were evaluated. Depending on the chosen PIM configuration, the utilized algorithm achieves 140GFLOPS of processing performance or a maximum computing efficiency of synthetically 30% or realistically 24.5%. The latter depicts a 2x improvement compared to state-of-the-art. A following discussion showcases the results in depth
    corecore