43 research outputs found

    An Error-Based Approximation Sensing Circuit for Event-Triggered, Low Power Wearable Sensors

    Get PDF
    Event-based sensors have the potential to optimize energy consumption at every stage in the signal processing pipeline, including data acquisition, transmission, processing and storage. However, almost all state-of-the-art systems are still built upon the classical Nyquist-based periodic signal acquisition. In this work, we design and validate the Polygonal Approximation Sampler (PAS), a novel circuit to implement a general-purpose event-based sampler using a polygonal approximation algorithm as the underlying sampling trigger. The circuit can be dynamically reconfigured to produce a coarse or a detailed reconstruction of the analog input, by adjusting the error threshold of the approximation. The proposed circuit is designed at the Register Transfer Level and processes each input sample received from the ADC in a single clock cycle. The PAS has been tested with three different types of archetypal signals captured by wearable devices (electrocardiogram, accelerometer and respiration data) and compared with a standard periodic ADC. These tests show that single-channel signals, with slow variations and constant segments (like the used single-lead ECG and the respiration signals) take great advantage from the used sampling technique, reducing the amount of data used up to 99% without significant performance degradation. At the same time, multi-channel signals (like the six-dimensional accelerometer signal) can still benefit from the designed circuit, achieving a reduction factor up to 80% with minor performance degradation. These results open the door to new types of wearable sensors with reduced size and higher battery lifetime

    RRAM Crossbar Arrays for Storage Class Memory Applications : Throughput and Density Considerations

    Get PDF
    As more and more high density memories are required to satisfy the Internet of Things ecosystem, academics and industrials are looking for an intermediate solution to fill the gap between DRAM and Flash NAND in the memory hierarchy. The emergence of Resistive Switching Technologies (RRAM) proposes a potential solution to this demand for fast, low cost, high density and non-volatile memory. However, nowadays transistor-less RRAM-based architectures, such as Crosspoint, suffers of several issues such as sneakpath, IRdrop and periphery overhead. In this work, we propose to explore the positioning of RRAM crosspoint memories regarding DRAM and NAND in terms of density and write throughput. We present several design guidelines then show that for the optimal RRAM crosspoint architecture (2-layers with common bitline), massively multiple bank write is the solution to optimize density and write throughput to around 20-100Gbit/cm2 and 200-500MB/s respectively for 32 to 64 parallel access

    BLADE: A BitLine Accelerator for Devices on the Edge

    Get PDF
    The increasing ubiquity of edge devices in the consumer market, along with their ever more computationally expensive workloads, necessitate corresponding increases in computing power to support such workloads. In-memory computing is attractive in edge devices as it reuses preexisting memory elements, thus limiting area overhead. Additionally, in-SRAM Computing (iSC) efficiently performs computations on spatially local data found in a variety of emerging edge device workloads. We therefore propose, implement, and benchmark BLADE, a BitLine Accelerator for Devices on the Edge. BLADE is an iSC architecture that can perform massive SIMD-like complex operations on hundreds to thousands of operands simultaneously. We implement BLADE in 28nm CMOS and demonstrate its functionality down to 0.6V, lower than any conventional state-of-the-art iSC architecture. We also benchmark BLADE in conjunction with a full Linux software stack in the gem5 architectural simulator, providing a robust demonstration of its performance gain in comparison to an equivalent embedded processor equipped with a NEON SIMD co-processor. We benchmark BLADE with three emerging edge device workloads, namely cryptography, high efficiency video coding, and convolutional neural networks, and demonstrate 4x, 6x, and 3x performance improvement, respectively, in comparison to a baseline CPU/NEON processor at an equivalent power budget

    A Fast, Reliable and Wide-voltage-range In-memory Computing Architecture

    Get PDF
    In-Memory Computing (IMC) solutions, and particularly bitline computing in SRAM, appear promising as they mitigate one of the most energy consuming aspects in computation: data movement. In this work we propose a fast (2.4Ghz for bitwise operations and 2.4/1.5Ghz for 32/64bit additions respectively), reliable (no read disturb issues) and wide voltage range (from 0.6 to 1V) 6T SRAM-based IMC architecture using local bitlines and a fast carry adder pitched below the memory subarray. We verify the proposed architecture through layout and variability aware simulations in 28nm bulk high performances technology PDK

    RRAMSpec: A Design Space Exploration Framework for High Density Resistive RAM

    Get PDF
    Resistive RAM (RRAM) is a promising emerging Non-Volatile Memory candidate due to its scalability and CMOS compatibility, which enables the fabrication of high density RRAM crossbar arrays in Back-End-Of-Line CMOS processes. Fast and accurate architectural models of RRAM crossbar devices are required to perform system level design space explorations of new Storage Class Memory (SCM) architectures using RRAM e.g. Non-Volatile-DIMM-P (NVDIMM-P). The major challenge in architectural modeling is the trade-off between accuracy and computing intensity. In this paper we present RRAMSpec, an architecture design space exploration framework, which enables fast exploration of various architectural trade-offs in designing high density RRAM devices, at accuracy levels close to circuit level simulators. The framework estimates silicon area, timings, and energy for RRAM devices. It outperforms state-of-the-art RRAM modeling tools by conducting architectural explorations at very high accuracy levels within few seconds of execution time. Our evaluations show various trade-offs in designing RRAM crossbar arrays with respect to array sizes, write time and write energy. Finally we present the influence of technology scaling on different RRAM design trade-offs

    Switching event detection and self-termination programming circuit for energy efficient ReRAM memory arrays

    Get PDF
    Energy efficiency remains a challenge for the design of non-volatile resistive memories (ReRAMs) arrays. This memory technology suffers from intrinsic variability in switching speed, programming voltages and resistance levels. The programming conditions of memory elements (e.g. pulse widths and amplitudes) must cover the tail bits to avoid programming failures. Switching time of ReRAMs shows wide distributions. Therefore, fast cells are subjects for electrical stress after their switching and energy waste since programming currents are typically large for this technology (tens of ”A). In this paper, we present a Write Termination (WT) circuit to stop the programming operation when the switching event occurs in the selected memory element. The proposed design is sensitive to current variations that take place when the memory element switches between two different resistance states (LRS and HRS). This WT scheme reduces the power consumption by 97+%, 93+% and 65+% during Forming, RESET and SET operations respectively. Our estimations show that area efficiency of 70% for a memory array is achievable when the presented WT circuit is integrated in near-memory peripheries. The demonstrated WT circuit is suitable for different ReRAM technologies with small overhead penalty and shows robustness against CMOS and ReRAM variabilities

    Graphene-based Wireless Agile Interconnects for Massive Heterogeneous Multi-chip Processors

    Full text link
    The main design principles in computer architecture have recently shifted from a monolithic scaling-driven approach to the development of heterogeneous architectures that tightly co-integrate multiple specialized processor and memory chiplets. In such data-hungry multi-chip architectures, current Networks-in-Package (NiPs) may not be enough to cater to their heterogeneous and fast-changing communication demands. This position paper makes the case for wireless in-package nanonetworking as the enabler of efficient and versatile wired-wireless interconnect fabrics for massive heterogeneous processors. To that end, the use of graphene-based antennas and transceivers with unique frequency-beam reconfigurability in the terahertz band is proposed. The feasibility of such a nanonetworking vision and the main research challenges towards its realization are analyzed from the technological, communications, and computer architecture perspectives.Comment: 8 pages, 4 figures, 1 table - Accepted at IEEE Wireless Communications Magazin

    Mémoires 3D haute densité à base de technologies résistives : architecture et circuit

    No full text
    Alors que les mĂ©moires non-volatiles conventionnelles, telles que les mĂ©moires flash Ă  grille flottante, deviennent de plus en plus complexes Ă  intĂ©grer et souffrent de performances et d’une fiabilitĂ© de plus en plus rĂ©duite, les mĂ©moires Ă  variation de rĂ©sistance (RRAM) telles que les OxRAM, CBRAM, MRAM ou PCM sont vues dans la communautĂ© scientifique comme une alternative crĂ©dible. Cependant, les architectures de RRAM standard (telles que la 1Transistor-1RRAM) ne sont pas compĂ©titives avec les mĂ©moires flash sur le terrain de la densitĂ©. Ainsi, cette thĂšse se propose d’explorer le potentiel des architectures RRAM sans transistor que sont l’architecture Crosspoint et l’architecture VRRAM.Dans un premier temps, le positionnement des architectures Crosspoint et VRRAM dans la hiĂ©rarchie mĂ©moire est Ă©tudiĂ©. De nouvelles problĂ©matiques, telles que les courant de sneakpath, la chute de tension dans les mĂ©taux ou la surface des circuits pĂ©riphĂ©riques sont identifiĂ©es et modĂ©lisĂ©es. Dans un second temps, des solutions circuit rĂ©pondant aux problĂ©matiques Ă©voquĂ©es prĂ©cĂ©demment sont proposĂ©es. Finalement, cette thĂšse se propose d’explorer les opportunitĂ©s ouvertes par l’utilisation de transistors innovants pour amĂ©liorer la densitĂ© ou les performances des architectures mĂ©moires utilisant des RRAM.While conventional non-volatiles memories, such as floating gate Flash memories, are becoming more and more difficult and costly to integrate and suffer of reduced performances and reliability, emerging resistive switching memories (RRAM), such as OxRAM, CBRAM, MRAM or PCM, are seen in the scientific community as a good way for tomorrow’s high-density memories. However, standard RRAM architectures (such as 1 Transistor-1 RRAM) are not competitive with flash technology in terms of density. Thereby, this thesis proposes to explore the opportunities opened by transistor-less RRAM architectures: Crosspoint and Vertical RRAM (VRRAM) architectures.First, the positioning of Crosspoint and VRRAM architectures in the memory hierarchy is studied. New constraints such as the sneakpath currents, the voltage drop through the metal lines or the periphery area overhead are identified and modeled. In a second time, circuit solutions answering to previously mentioned effects are proposed. Finally, this thesis proposes to explore new opportunities opened by the use of innovative transistors to improve the density or the performances of RRAM-based memory architectures

    A design framework for thermal-aware power delivery network in 3D MPSoCs with integrated flow cell arrays

    No full text
    Integrated Flow Cell Array (FCA) technology promises to address the power delivery and heat dissipation challenges in three-dimensional Multi-Processor Systems-on-Chips (3D MPSoCs) by providing combined inter-tier liquid cooling and power generation capabilities. In this paper, we present for the first time a design framework to accurately model the temperature-aware power delivery network in 3D MPSoCs, and quantify the effects of FCAs on the voltage drop (IR-drop). This framework estimates the power generation variation along FCAs due to voltage and temperature, in the case of uniform and non-uniform powermaps from several real processor traces. Furthermore, we explore different 3D MPSoC configurations to quantify their power delivery requirements. Our results show that FCAs improve the IR-drop with respect to state-of-the-art design methods up to 53% and 30% for dies with a power consumption of 60W and 190W, respectively, while maintaining their peak temperatures below 52°C, and at no additional Through Silicon Via (TSV) area overhead. In addition, as the presence of high power density regions (hotspots) can decrease the FCAs IR-drop reduction by up to 21% with respect to the average value, we present a scalable TSV placement optimization methodology using the proposed framework. This methodology minimizes the IR-drop at hotspots and guarantees an optimal and uniform exploitation of the IR-drop reduction benefits of FCAs
    corecore