Search CORE

20 research outputs found

Energy-Aware Data Movement In Non-Volatile Memory Hierarchies

Author: Najafabadi Navid Khoshavi
Publication venue: 'Information Bulletin on Variable Stars (IBVS)'
Publication date: 01/01/2017
Field of study

While technology scaling enables increased density for memory cells, the intrinsic high leakage power of conventional CMOS technology and the demand for reduced energy consumption inspires the use of emerging technology alternatives such as eDRAM and Non-Volatile Memory (NVM) including STT-MRAM, PCM, and RRAM. The utilization of emerging technology in Last Level Cache (LLC) designs which occupies a signifcant fraction of total die area in Chip Multi Processors (CMPs) introduces new dimensions of vulnerability, energy consumption, and performance delivery. To be specific, a part of this research focuses on eDRAM Bit Upset Vulnerability Factor (BUVF) to assess vulnerable portion of the eDRAM refresh cycle where the critical charge varies depending on the write voltage, storage and bit-line capacitance. This dissertation broaden the study on vulnerability assessment of LLC through investigating the impact of Process Variations (PV) on narrow resistive sensing margins in high-density NVM arrays, including on-chip cache and primary memory. Large-latency and power-hungry Sense Amplifers (SAs) have been adapted to combat PV in the past. Herein, a novel approach is proposed to leverage the PV in NVM arrays using Self-Organized Sub-bank (SOS) design. SOS engages the preferred SA alternative based on the intrinsic as-built behavior of the resistive sensing timing margin to reduce the latency and power consumption while maintaining acceptable access time. On the other hand, this dissertation investigates a novel technique to prioritize the service to 1) Extensive Read Reused Accessed blocks of the LLC that are silently dropped from higher levels of cache, and 2) the portion of the working set that may exhibit distant re-reference interval in L2. In particular, we develop a lightweight Multi-level Access History Profiler to effciently identify ERRA blocks through aggregating the LLC block addresses tagged with identical Most Signifcant Bits into a single entry. Experimental results indicate that the proposed technique can reduce the L2 read miss ratio by 51.7% on average across PARSEC and SPEC2006 workloads. In addition, this dissertation will broaden and apply advancements in theories of subspace recovery to pioneer computationally-aware in-situ operand reconstruction via the novel Logic In Interconnect (LI2) scheme. LI2 will be developed, validated, and re?ned both theoretically and experimentally to realize a radically different approach to post-Moore\u27s Law computing by leveraging low-rank matrices features offering data reconstruction instead of fetching data from main memory to reduce energy/latency cost per data movement. We propose LI2 enhancement to attain high performance delivery in the post-Moore\u27s Law era through equipping the contemporary micro-architecture design with a customized memory controller which orchestrates the memory request for fetching low-rank matrices to customized Fine Grain Reconfigurable Accelerator (FGRA) for reconstruction while the other memory requests are serviced as before. The goal of LI2 is to conquer the high latency/energy required to traverse main memory arrays in the case of LLC miss, by using in-situ construction of the requested data dealing with low-rank matrices. Thus, LI2 exchanges a high volume of data transfers with a novel lightweight reconstruction method under specific conditions using a cross-layer hardware/algorithm approach

University of Central Florida (UCF): STARS (Showcase of Text, Archives, Research & Scholarship)

Modélisation compacte et conception de circuit à base de jonction tunnel ferroélectrique et de jonction tunnel magnétique exploitant le transfert de spin assisté par effet Hall de spin

Author: Wang Zhaohao
Publication venue: HAL CCSD
Publication date: 14/10/2015
Field of study

Non-volatile memory (NVM) devices have been attracting intensive research interest since they promise to solve the increasing static power issue caused by CMOS technology scaling. This thesis focuses on two fields related to NVM: the one is the ferroelectric tunnel junction (FTJ), which is a recent emerging NVM device. The other is the spin-Hall-assisted spin-transfer torque (STT), which is a recent proposed write approach for the magnetic tunnel junction (MTJ). Our objective is to develop the compact models for these two technologies and to explore their application in the non-volatile circuits through simulation.First, we investigated physical models describing the electrical behaviors of the FTJ such as tunneling resistance, dynamic ferroelectric switching and memristive response. The accuracy of these physical models is validated by a good agreement with experimental results. In order to develop an electrical model available for the circuit simulation, we programmed the aforementioned physical models with Verilog-A language and integrated them together. The developed electrical model can run on Cadence platform (a standard circuit simulation tool) and faithfully reproduce the behaviors of the FTJ.Then, using the developed FTJ model and STMicroelectronics CMOS design kit, we designed and simulated three types of circuits: i) FTJ-based random access memory (FTRAM), ii) two FTJ-based neuromorphic systems, one of which emulates spike-timing dependent plasticity (STDP) learning rule, the other implements supervised learning of logic functions, iii) FTJ-based Boolean logic block, by which NAND and NOR logic are demonstrated. The influences of the FTJ parameters on the performance of these circuits were analyzed based on simulation results.Finally, we focused on the reversal of the perpendicular magnetization driven by spin-Hall-assisted STT in a three-terminal MTJ. In this scheme, two write currents are applied to generate spin-Hall effect (SHE) and STT. Numerical simulation based on Landau-Lifshitz-Gilbert (LLG) equation demonstrates that the incubation delay of the STT can be eliminated by the strong SHE, resulting in ultrafast magnetization switching without the need to strengthen the STT. We applied this novel write approach to the design of the magnetic flip-flop and full-adder. Performance comparison between the spin-Hall-assisted and the conventional STT magnetic circuits were discussed based on simulation results and theoretical models.Les mémoires non-volatiles (MNV) sont l'objet d'un effort de recherche croissant du fait de leur capacité à limiter la consommation statique, qui obère habituellement la réduction des dimensions dans la technologie CMOS. Dans ce contexte, cette thèse aborde plus spécifiquement deux technologies de mémoires non volatiles : d'une part les jonctions tunnel ferroélectriques (JTF), dispositif non volatil émergent, et d'autre part les dispositifs à transfert de spin (TS) assisté par effet Hall de spin (EHS), approche alternative proposée récemment pour écrire les jonctions tunnel magnétiques (JTM). Mon objectif est de développer des modèles compacts pour ces deux technologies et d'explorer, par simulation, leur intégration dans les circuits non-volatiles.J'ai d'abord étudié les modèles physiques qui décrivent les comportements électriques des JTF : la résistance tunnel, la dynamique de la commutation ferroélectrique et leur comportement memristif. La précision de ces modèles physiques est validée par leur bonne adéquation avec les résultats expérimentaux. Afin de proposer un modèle compatible avec les simulateurs électriques standards, nous j'ai développé les modèles physiques mentionnés ci-dessus en langue Verilog-A, puis je les ai intégrés ensemble. Le modèle électrique que j'ai conçu peut être exploité sur la plate-forme Cadence (un outil standard pour la simulation de circuit). Il reproduit fidèlement les comportements de JTF. Ensuite, en utilisant ce modèle de JTF et le design-kit CMOS de STMicroelectronics, j'ai conçu et simulé trois types de circuits: i) une mémoire vive (RAM) basée sur les JTF, ii) deux systèmes neuromorphiques basés sur les JTF, l'un qui émule la règle d'apprentissage de la plasticité synaptique basée sur le décalage temporel des impulsions neuronale (STDP), l'autre mettant en œuvre l'apprentissage supervisé de fonctions logiques, iii) un bloc logique booléen basé sur les JTF, y compris la démonstration des fonctions logiques NAND et NOR. L'influence des paramètres de la JTF sur les performances de ces circuits a été analysée par simulation. Finalement, nous avons modélisé la dynamique de renversement de l'aimantation dans les dispositifs à anisotropie perpendiculaire à transfert de spin assisté par effet Hall de spin dans un JTM à trois terminaux. Dans ce schéma, deux courants d'écriture sont appliqués pour générer l'EHS et le TS. La simulation numérique basée sur l'équation de Landau-Lifshitz-Gilbert (LLG) démontre que le délai d'incubation de TS peut être éliminé par un fort EHS, conduisant à la commutation ultra-rapide de l'aimantation, sans pour autant requérir une augmentation excessive du TS. Nous avons appliqué cette nouvelle méthode d'écriture à la conception d'une bascule magnétique et d'un additionneur 1 bit magnétique. Les performances des circuits magnétiques assistés par l'EHS ont été comparés à ceux écrits par transfert de spin, par simulation et par une analyse fondée sur le modèle théorique

Thèses en Ligne

Recommended from our members

Automotive embedded systems software reprogramming

Author: Schmidgall Ralf
Publication venue: Brunel University School of Engineering and Design PhD Theses
Publication date: 01/01/2012
Field of study

This thesis was submitted for the degree of Doctor of Philosophy and was awarded by Brunel UniversityThe exponential growth of computer power is no longer limited to stand alone computing systems but applies to all areas of commercial embedded computing systems. The ongoing rapid growth in intelligent embedded systems is visible in the commercial automotive area, where a modern car today implements up to 80 different electronic control units (ECUs) and their total memory size has been increased to several hundreds of megabyte. This growth in the commercial mass production world has led to new challenges, even within the automotive industry but also in other business areas where cost pressure is high. The need to drive cost down means that every cent spent on recurring engineering costs needs to be justified. A conflict between functional requirements (functionality, system reliability, production and manufacturing aspects etc.), testing and maintainability aspects is given. Software reprogramming, as a key issue within the automotive industry, solve that given conflict partly in the past. Software Reprogramming for in-field service and maintenance in the after sales markets provides a strong method to fix previously not identified software errors. But the increasing software sizes and therefore the increasing software reprogramming times will reduce the benefits. Especially if ECU’s software size growth faster than vehicle’s onboard infrastructure can be adjusted. The thesis result enables cost prediction of embedded systems’ software reprogramming by generating an effective and reliable model for reprogramming time for different existing and new technologies. This model and additional research results contribute to a timeline for short term, mid term and long term solutions which will solve the currently given problems as well as future challenges, especially for the automotive industry but also for all other business areas where cost pressure is high and software reprogramming is a key issue during products life cycle

Brunel University Research Archive

Towards Computational Efficiency of Next Generation Multimedia Systems

Author: Khan Muhammad Usman Karim
Publication venue: KIT-Bibliothek, Karlsruhe
Publication date: 01/01/2015
Field of study

To address throughput demands of complex applications (like Multimedia), a next-generation system designer needs to co-design and co-optimize the hardware and software layers. Hardware/software knobs must be tuned in synergy to increase the throughput efficiency. This thesis provides such algorithmic and architectural solutions, while considering the new technology challenges (power-cap and memory aging). The goal is to maximize the throughput efficiency, under timing- and hardware-constraints

KITopen

Low Power Memory/Memristor Devices and Systems

Author
Publication venue: 'MDPI AG'
Publication date: 02/02/2023
Field of study

This reprint focusses on achieving low-power computation using memristive devices. The topic was designed as a convenient reference point: it contains a mix of techniques starting from the fundamental manufacturing of memristive devices all the way to applications such as physically unclonable functions, and also covers perspectives on, e.g., in-memory computing, which is inextricably linked with emerging memory devices such as memristors. Finally, the reprint contains a few articles representing how other communities (from typical CMOS design to photonics) are fighting on their own fronts in the quest towards low-power computation, as a comparison with the memristor literature. We hope that readers will enjoy discovering the articles within

Directory of Open Access Books (DOAB)

Circuit design for embedded memory in low-power integrated circuits

Author: Qazi Masood
Publication venue: Massachusetts Institute of Technology
Publication date: 01/01/2012
Field of study

Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2012.Cataloged from PDF version of thesis.Includes bibliographical references (p. 141-152).This thesis explores the challenges for integrating embedded static random access memory (SRAM) and non-volatile memory-based on ferroelectric capacitor technology-into lowpower integrated circuits. First considered is the impact of process variation in deep-submicron technologies on SRAM, which must exhibit higher density and performance at increased levels of integration with every new semiconductor generation. Techniques to speed up the statistical analysis of physical memory designs by a factor of 100 to 10,000 relative to the conventional Monte Carlo Method are developed. The proposed methods build upon the Importance Sampling simulation algorithm and efficiently explore the sample space of transistor parameter fluctuation. Process variation in SRAM at low-voltage is further investigated experimentally with a 512kb 8T SRAM test chip in 45nm SOI CMOS technology. For active operation, an AC coupled sense amplifier and regenerative global bitline scheme are designed to operate at the limit of on current and off current separation on a single-ended SRAM bitline. The SRAM operates from 1.2 V down to 0.57 V with access times from 400ps to 3.4ns. For standby power, a data retention voltage sensor predicts the mismatch-limited minimum supply voltage without corrupting the contents of the memory. The leakage power of SRAM forces the chip designer to seek non-volatile memory in applications such as portable electronics that retain significant quantities of data over long durations. In this scenario, the energy cost of accessing data must be minimized. This thesis presents a ferroelectric random access memory (FRAM) prototype that addresses the challenges of sensing diminishingly small charge under conditions favorable to low access energy with a time-to-digital sensing scheme. The 1 Mb IT1C FRAM fabricated in 130 nm CMOS operates from 1.5 V to 1.0 V with corresponding access energy from 19.2 pJ to 9.8 pJ per bit. Finally, the computational state of sequential elements interspersed in CMOS logic, also restricts the ability to power gate. To enable simple and fast turn-on, ferroelectric capacitors are integrated into the design of a standard cell register, whose non-volatile operation is made compatible with the digital design flow. A test-case circuit containing ferroelectric registers exhibits non-volatile operation and consumes less than 1.3 pJ per bit of state information and less than 10 clock cycles to save or restore with no minimum standby power requirement in-between active periods.by Masood Qazi.Ph.D

DSpace@MIT

Diseño e implementación de la placa de circuito impreso LoMo del nanosatélite TEIDESAT-I

Author: Pérez Del Pino Salvador
Publication venue
Publication date
Field of study

Repositorio Institucional de la Universidad de La Laguna

Spin-polarised currents and magnetic domain walls

Author: Marrows C.H.
Publication venue: 'Informa UK Limited'
Publication date: 01/12/2005
Field of study

Electrical currents flowing in ferromagnetic materials are spin-polarised as a result of the spin-dependent band structure. When the spatial direction of the polarisation changes, in a domain structure, the electrons must somehow accommodate the necessary change in direction of their spin angular momentum as they pass through the wall. Reflection, scattering, or a transfer of angular momentum onto the lattice are all possible outcomes, depending on the circumstances. This gives rise to a variety of different physical effects, most importantly a contribution to the electrical resistance caused by the wall, and a motion of the wall driven by the spin-polarised current

White Rose Research Online

DAMOV: A New Methodology and Benchmark Suite for Evaluating Data Movement Bottlenecks

Author: Fernandez Ivan
Ghose Saugata
Gómez-Luna Juan
Mutlu Onur
Oliveira Geraldo F.
Orosa Lois
Sadrosadati Mohammad
Vijaykumar Nandita
Publication venue
Publication date: 01/01/2021
Field of study

Data movement between the CPU and main memory is a first-order obstacle against improving performance, scalability, and energy efficiency in modern systems. Computer systems employ a range of techniques to reduce overheads tied to data movement, spanning from traditional mechanisms (e.g., deep multi-level cache hierarchies, aggressive hardware prefetchers) to emerging techniques such as Near-Data Processing (NDP), where some computation is moved close to memory. Our goal is to methodically identify potential sources of data movement over a broad set of applications and to comprehensively compare traditional compute-centric data movement mitigation techniques to more memory-centric techniques, thereby developing a rigorous understanding of the best techniques to mitigate each source of data movement. With this goal in mind, we perform the first large-scale characterization of a wide variety of applications, across a wide range of application domains, to identify fundamental program properties that lead to data movement to/from main memory. We develop the first systematic methodology to classify applications based on the sources contributing to data movement bottlenecks. From our large-scale characterization of 77K functions across 345 applications, we select 144 functions to form the first open-source benchmark suite (DAMOV) for main memory data movement studies. We select a diverse range of functions that (1) represent different types of data movement bottlenecks, and (2) come from a wide range of application domains. Using NDP as a case study, we identify new insights about the different data movement bottlenecks and use these insights to determine the most suitable data movement mitigation mechanism for a particular application. We open-source DAMOV and the complete source code for our new characterization methodology at https://github.com/CMU-SAFARI/DAMOV.Comment: Our open source software is available at https://github.com/CMU-SAFARI/DAMO

arXiv.org e-Print Archive

Repository for Publications and Research Data

Directory of Open Access Journals

Recommended from our members

Picosecond Spin Seebeck Effect in Magnetic Insulators

Author: Kholid Farhan Nur
Publication venue: University of Cambridge
Publication date: 14/10/2020
Field of study

The spin Seebeck effect (SSE) refers to spin current generation in a magnetic insulator/non-magnetic metal heterostructure due to a thermal gradient and most SSE experiments typically use an electric current as heat source. This study uses Terahertz (THz) time-domain emission spectroscopy where a femtosecond laser induces a thermal gradient that lasts for only a few picoseconds and the SSE signal manifests as broadband THz electric field radiation. The ultra-fast laser excitation and detection technique enable focusing the analysis on the interfacial contributions of the SSE, minimising contributions from the bulk magnet which dominate instead in the electronic measurements. The first part of the thesis (Chapter 3) discusses the steps and calibration procedures involved in building the THz emission spectroscopy setup. We describe the process of optimising the setup sensitivity, aiming at reaching the same performance allowed in the THz setups present in other groups, and discuss potential improvements. The second part of this thesis describes measurements of the picosecond SSE in different magnetic states: ferrimagnetic, antiferromagnetic, and paramagnetic. Probing the picosecond SSE in well-studied ferrimagnetic YIG at low temperatures (Chapter 4) points out key differences with respect to previous electrical measurements done at low frequency. Comparing two antiferromagnets from the same magnetic class but with very different values of the antiferromagnetic resonant frequencies (Chapter 5 and Chapter 6) provides insights on the role of magnon energy dispersion in determining the efficiency of interfacial spin transport. Measuring the picosecond SSE in an antiferromagnet with magnetic transition temperature at 117 K allows a comparison of the spin emission in the ordered and paramagnetic phase (Chapter 6). Finally, we observe extremely strong SSE signals from the antiferromagnets despite the low field-induced magnetic moment and we attribute this to strong interfacial exchange coupling and high-frequency spin susceptibility.This PhD was funded by the Jardine Foundation and the Cambridge Trust

Apollo (Cambridge)