Search CORE

1,540 research outputs found

Multi-port Memory Design for Advanced Computer Architectures

Author: Zhao Yirong
Publication venue
Publication date: 24/09/2013
Field of study

In this thesis, we describe and evaluate novel memory designs for multi-port on-chip and off-chip use in advanced computer architectures. We focus on combining multi-porting and evaluating the performance over a range of design parameters. Multi-porting is essential for caches and shared-data systems, especially multi-core System-on-chips (SOC). It can significantly increase the memory access throughput. We evaluate FinFET voltage-mode multi-port SRAM cells using different metrics including leakage current, static noise margin and read/write performance. Simulation results show that single-ended multi-port FinFET SRAMs with isolated read ports offer improved read stability and flexibility over classical double-ended structures at the expense of write performance. By increasing the size of the access transistors, we show that the single-ended multi-port structures can achieve equivalent write performance to the classical double-ended multi-port structure for 9% area overhead. Moreover, compared with CMOS SRAM, FinFET SRAM has better stability and standby power. We also describe new methods for the design of FinFET current-mode multi-port SRAM cells. Current-mode SRAMs avoid the full-swing of the bitline, reducing dynamic power and access time. However, that comes at the cost of voltage drop, which compromises stability. The design proposed in this thesis utilizes the feature of Independent Gate (IG) mode FinFET, which can leverage threshold voltage by controlling the back gate voltage, to merge two transistors into one through high-Vt and low-Vt transistors. This design not only reduces the voltage drop, but it also reduces the area in multi-port current-mode SRAM design. For off-chip memory, we propose a novel two-port 1-read, 1-write (1R1W) phasechange memory (PCM) cell, which significantly reduces the probability of blocking at the bank levels. Different from the traditional PCM cell, the access transistors are at the top and connected to the bitline. We use Verilog-A to model the behavior of Ge2Sb2Te5 (GST: the storage component). We evaluate the performance of the two-port cell by transistor sizing and voltage pumping. Simulation results show that pMOS transistor is more practical than nMOS transistor as the access device when both area and power are considered. The estimated area overhead is 1.7�, compared to single-port PCM cell. In brief, the contribution we make in this thesis is that we propose and evaluate three different kinds of multi-port memories that are favorable for advanced computer architectures

A Construction Kit for Efficient Low Power Neural Network Accelerator Designs

Author: Azarkhish Erfan
Benini Luca
Bonetti Andrea
Emery Stephane
Jokic Petar
Pons Marc
Publication venue
Publication date: 24/06/2021
Field of study

Implementing embedded neural network processing at the edge requires efficient hardware acceleration that couples high computational performance with low power consumption. Driven by the rapid evolution of network architectures and their algorithmic features, accelerator designs are constantly updated and improved. To evaluate and compare hardware design choices, designers can refer to a myriad of accelerator implementations in the literature. Surveys provide an overview of these works but are often limited to system-level and benchmark-specific performance metrics, making it difficult to quantitatively compare the individual effect of each utilized optimization technique. This complicates the evaluation of optimizations for new accelerator designs, slowing-down the research progress. This work provides a survey of neural network accelerator optimization approaches that have been used in recent works and reports their individual effects on edge processing performance. It presents the list of optimizations and their quantitative effects as a construction kit, allowing to assess the design choices for each building block separately. Reported optimizations range from up to 10'000x memory savings to 33x energy reductions, providing chip designers an overview of design choices for implementing efficient low power neural network accelerators

arXiv.org e-Print Archive

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

Ultra-Low-Power Embedded SRAM Design for Battery- Operated and Energy-Harvested IoT Applications

Author: Banerjee Arijit
Publication venue: 'IntechOpen'
Publication date: 20/06/2018
Field of study

Internet of Things (IoT) devices such as wearable health monitors, augmented reality goggles, home automation, smart appliances, etc. are a trending topic of research. Various IoT products are thriving in the current electronics market. The IoT application needs such as portability, form factor, weight, etc. dictate the features of such devices. Small, portable, and lightweight IoT devices limit the usage of the primary energy source to a smaller rechargeable or non-rechargeable battery. As battery life and replacement time are critical issues in battery-operated or partially energy-harvested IoT devices, ultra-low-power (ULP) system on chips (SoC) are becoming a widespread solution of chip makers’ choice. Such ULP SoC requires both logic and the embedded static random access memory (SRAM) in the processor to operate at very low supply voltages. With technology scaling for bulk and FinFET devices, logic has demonstrated to operate at low minimum operating voltages (VMIN). However, due to process and temperature variation, SRAMs have higher VMIN in scaled processes that become a huge problem in designing ULP SoC cores. This chapter discusses the latest published circuits and architecture techniques to minimize the SRAM VMIN for scaled bulk and FinFET technologies and improve battery life for ULP IoT applications