Search CORE

148 research outputs found

Recommended from our members

A RISC-V Vector Processor With Simultaneous-Switching Switched-Capacitor DC-DC Converters in 28 nm FDSOI

Author: Alon E
Asanović K
Avizienis R
Bailey S
Blagojević M
Chen PH
Chiu PF
Flatresse P
Jevtić R
Keller B
Kwak J
Le HP
Lee Y
Nikolić B
Puggelli A
Richards B
Sutardja N
Waterman A
Zimmer B
Publication venue: eScholarship, University of California
Publication date: 01/04/2016
Field of study

This work demonstrates a RISC-V vector microprocessor implemented in 28 nm FDSOI with fully integrated simultaneous-switching switched-capacitor DC-DC (SC DC-DC) converters and adaptive clocking that generates four on-chip voltages between 0.45 and 1 V using only 1.0 V core and 1.8 V IO voltage inputs. The converters achieve high efficiency at the system level by switching simultaneously to avoid charge-sharing losses and by using an adaptive clock to maximize performance for the resulting voltage ripple. Details about the implementation of the DC-DC switches, DC-DC controller, and adaptive clock are provided, and the sources of conversion loss are analyzed based on measured results. This system pushes the capabilities of dynamic voltage scaling by enabling fast transitions (20 ns), simple packaging (no off-chip passives), low area overhead (16%), high conversion efficiency (80%-86%), and high energy efficiency (26.2 DP GFLOPS/W) for mobile devices

eScholarship - University of California

Cross-layer system reliability assessment framework for hardware faults

Author: Bosio Alberto
Canal Corretger Ramon
Chatzidimitriou Athanansios
Di Carlo Stefano
Di Natale Giorgio
Gizipoulos Dimitris
González Colás Antonio María
Kaliorakis Manolis
Kooli Maha
Politano Gianfranco
Riera Villanueva Marc
Savino Alessandro
Tselonis Sotiris
Vallero Alessandro
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2016
Field of study

System reliability estimation during early design phases facilitates informed decisions for the integration of effective protection mechanisms against different classes of hardware faults. When not all system abstraction layers (technology, circuit, microarchitecture, software) are factored in such an estimation model, the delivered reliability reports must be excessively pessimistic and thus lead to unacceptably expensive, over-designed systems. We propose a scalable, cross-layer methodology and supporting suite of tools for accurate but fast estimations of computing systems reliability. The backbone of the methodology is a component-based Bayesian model, which effectively calculates system reliability based on the masking probabilities of individual hardware and software components considering their complex interactions. Our detailed experimental evaluation for different technologies, microarchitectures, and benchmarks demonstrates that the proposed model delivers very accurate reliability estimations (FIT rates) compared to statistically significant but slow fault injection campaigns at the microarchitecture level.Peer ReviewedPostprint (author's final draft

Crossref

UPCommons. Portal del coneixement obert de la UPC

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

PORTO Publications Open Repository TOrino

SCORPIO: A 36-Core Research Chip Demonstrating Snoopy Coherence on a Scalable Mesh NoC with In-Network Ordering

Author: Chandrakasan Anantha P.
Chen Chia-Hsin
Daya Bhavya Kishor
Holt Jim
Krishna Tushar
Kwon Woo Cheol
Park Sunghyun
Peh Li-Shiuan
Subramanian Suvinay
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 30/06/2014
Field of study

URL to conference programIn the many-core era, scalable coherence and on-chip interconnects are crucial for shared memory processors. While snoopy coherence is common in small multicore systems, directory-based coherence is the de facto choice for scalability to many cores, as snoopy relies on ordered interconnects which do not scale. However, directory-based coherence does not scale beyond tens of cores due to excessive directory area overhead or inaccurate sharer tracking. Prior techniques supporting ordering on arbitrary unordered networks are impractical for full multicore chip designs. We present SCORPIO, an ordered mesh Network-on-Chip(NoC) architecture with a separate fixed-latency, bufferless network to achieve distributed global ordering. Message delivery is decoupled from the ordering, allowing messages to arrive in any order and at any time, and still be correctly ordered. The architecture is designed to plug-and-play with existing multicore IP and with practicality, timing, area, and power as top concerns. Full-system 36 and 64-core simulations on SPLASH-2 and PARSEC benchmarks show an average application run time reduction of 24.1% and 12.9%, in comparison to distributed directory and AMD HyperTransport coherence protocols, respectively. The SCORPIO architecture is incorporated in an 11 mm-by- 13 mm chip prototype, fabricated in IBM 45nm SOI technology, comprising 36 Freescale e200 Power Architecture TM cores with private L1 and L2 caches interfacing with the NoC via ARM AMBA, along with two Cadence on-chip DDR2 controllers. The chip prototype achieves a post synthesis operating frequency of 1 GHz (833 MHz post-layout) with an estimated power of 28.8 W (768 mW per tile), while the network consumes only 10% of tile area and 19 % of tile power.United States. Defense Advanced Research Projects Agency (DARPA UHPC grant at MIT (Angstrom))Center for Future Architectures ResearchMicroelectronics Advanced Research Corporation (MARCO)Semiconductor Research Corporatio

DSpace@MIT

Crossref

Chōteidenryoku daikibo shūseki kairo no tame no denryoku kōritsu no takai kiban baiasu seigyo

Author: Okuhara Hayate
オクハラハヤテ
奥原颯
Publication venue: 慶應義塾大学大学院理工学研究科
Publication date
Field of study

KeiO Academic Resource Archive

Always-On 674uW @ 4GOP/s Error Resilient Binary Neural Networks with Aggressive SRAM Voltage Scaling on a 22nm IoT End-Node

Author: Benini Luca
Conti Francesco
Di Mauro Alfio
Rossi Davide
Schiavone Pasquale Davide
Publication venue
Publication date: 01/01/2020
Field of study

Binary Neural Networks (BNNs) have been shown to be robust to random bit-level noise, making aggressive voltage scaling attractive as a power-saving technique for both logic and SRAMs. In this work, we introduce the first fully programmable IoT end-node system-on-chip (SoC) capable of executing software-defined, hardware-accelerated BNNs at ultra-low voltage. Our SoC exploits a hybrid memory scheme where error-vulnerable SRAMs are complemented by reliable standard-cell memories to safely store critical data under aggressive voltage scaling. On a prototype in 22nm FDX technology, we demonstrate that both the logic and SRAM voltage can be dropped to 0.5Vwithout any accuracy penalty on a BNN trained for the CIFAR-10 dataset, improving energy efficiency by 2.2X w.r.t. nominal conditions. Furthermore, we show that the supply voltage can be dropped to 0.42V (50% of nominal) while keeping more than99% of the nominal accuracy (with a bit error rate ~1/1000). In this operating point, our prototype performs 4Gop/s (15.4Inference/s on the CIFAR-10 dataset) by computing up to 13binary ops per pJ, achieving 22.8 Inference/s/mW while keeping within a peak power envelope of 674uW - low enough to enable always-on operation in ultra-low power smart cameras, long-lifetime environmental sensors, and insect-sized pico-drones.Comment: Submitted to ISICAS2020 journal special issu

arXiv.org e-Print Archive

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

Dynamic Thermal Management Of Vertically Stacked Heterogeneous Processors

Author: Sharma Ajay
Publication venue: eGrove
Publication date: 01/01/2016
Field of study

eGrove (Univ. of Mississippi)

Reinventing Integrated Photonic Devices and Circuits for High Performance Communication and Computing Applications

Author: Karempudi Venkata Sai Praneeth
Publication venue: UKnowledge
Publication date: 01/01/2024
Field of study

The long-standing technological pillars for computing systems evolution, namely Moore\u27s law and Von Neumann architecture, are breaking down under the pressure of meeting the capacity and energy efficiency demands of computing and communication architectures that are designed to process modern data-centric applications related to Artificial Intelligence (AI), Big Data, and Internet-of-Things (IoT). In response, both industry and academia have turned to \u27more-than-Moore\u27 technologies for realizing hardware architectures for communication and computing. Fortunately, Silicon Photonics (SiPh) has emerged as one highly promising ‘more-than-Moore’ technology. Recent progress has enabled SiPh-based interconnects to outperform traditional electrical interconnects, offering advantages like high bandwidth density, near-light speed data transfer, distance-independent bitrate, and low energy consumption. Furthermore, SiPh-based electro-optic (E-O) computing circuits have exhibited up to two orders of magnitude improvements in performance and energy efficiency compared to their electronic counterparts. Thus, SiPh stands out as a compelling solution for creating high-performance and energy-efficient hardware for communication and computing applications. Despite their advantages, SiPh-based interconnects face various design challenges that hamper their reliability, scalability, performance, and energy efficiency. These include limited optical power budget (OPB), high static power dissipation, crosstalk noise, fabrication and on-chip temperature variations, and limited spectral bandwidth for multiplexing. Similarly, SiPh-based E-O computing circuits also face several challenges. Firstly, the E-O circuits for simple logic functions lack the all-electrical input handling, raising hardware area and complexity. Secondly, the E-O arithmetic circuits occupy vast areas (at least 100x) while hardly achieving more than 60% hardware utilization, versus CMOS implementations, leading to high idle times, and non-amortizable area and static power overheads. Thirdly, the high area overhead of E-O circuits hinders them from achieving high spatial parallelism on-chip. This is because the high area overhead limits the count of E-O circuits that can be implemented on a reticle-size limited chip. My research offers significant contributions to address the aforementioned challenges. For SiPh-based interconnects, my contributions focus on enhancing OPB by mitigating crosstalk noise, addressing the optical non-linearity-related issues through the development of Silicon-on-Sapphire-based photonic interconnects, exploring multi-level signaling, and evaluating various device-level design pathways. This enables the design of high throughput (\u3e1Tbps) and energy-efficient (\u3c1pJ/bit) SiPh interconnects. In the context of SiPh-based E-O circuits, my contributions include the design of a microring-based polymorphic E-O logic gate, a hybrid time-amplitude analog optical modulator, and an indium tin oxide-based silicon nitride microring modulator and a weight bank for neural network computations. These designs significantly reduce the area overhead of current E-O computing circuits while enhancing the energy-efficiency, and hardware utilization

University of Kentucky

Power Profiling Model for RISC-V Core

Author: Zhang Lin
Publication venue
Publication date: 12/06/2023
Field of study

The reduction of power consumption is considered to be a critical factor for efficient computation of microprocessors. Therefore, it is necessary to implement a power management system that is aware of the computational load of the CPU cores. To enable such power management, this project aims to develop a power profiling model for the RISC-V core. TheSyDeKick verification environment was used to develop the power profiling models. Additionally, Python-controlled mixed mode simulations of C-programs compiled for A-Core were conducted to obtain needed data for the power profiling of the digital circuitry. The proposed methodology could employ a time-varying power consumption profiling for the A-Core RISC-V microprocessor core which depends on software, voltage, and clock frequency. The results of this project allow for the creation of parameterized power profiles for the A-Core, which can contribute to more efficient and sustainable computing

Aaltodoc Publication Archive