Search CORE

273 research outputs found

Vector processing-aware advanced clock-gating techniques for low-power fused multiply-add

Author: Cristal Kestelman Adrián
Palomar Pérez Óscar
Ratkovic Ivan
Stanic Milan
Unsal Osman Sabri
Valero Cortés Mateo
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2018
Field of study

The need for power efficiency is driving a rethink of design decisions in processor architectures. While vector processors succeeded in the high-performance market in the past, they need a retailoring for the mobile market that they are entering now. Floating-point (FP) fused multiply-add (FMA), being a functional unit with high power consumption, deserves special attention. Although clock gating is a well-known method to reduce switching power in synchronous designs, there are unexplored opportunities for its application to vector processors, especially when considering active operating mode. In this research, we comprehensively identify, propose, and evaluate the most suitable clock-gating techniques for vector FMA units (VFUs). These techniques ensure power savings without jeopardizing the timing. We evaluate the proposed techniques using both synthetic and “real-world” application-based benchmarking. Using vector masking and vector multilane-aware clock gating, we report power reductions of up to 52%, assuming active VFU operating at the peak performance. Among other findings, we observe that vector instruction-based clock-gating techniques achieve power savings for all vector FP instructions. Finally, when evaluating all techniques together, using “real-world” benchmarking, the power reductions are up to 80%. Additionally, in accordance with processor design trends, we perform this research in a fully parameterizable and automated fashion.The research leading to these results has received funding from the RoMoL ERC Advanced Grant GA 321253 and is supported in part by the European Union (FEDER funds) under contract TTIN2015-65316-P. The work of I. Ratkovic was supported by a FPU research grant from the Spanish MECD.Peer ReviewedPostprint (author's final draft

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

RISC-V PROCESSOR PERFORMANCE ANALYSIS OF SECURE DESIGN PRINCIPLES

Author: Shin Roy S.
Publication venue: Monterey, CA; Naval Postgraduate School
Publication date: 01/12/2023
Field of study

This project explores processor microarchitecture features that impact security and performance by conceptualizing and describing a RISC-V processor design with security as the priority.We begin by evaluating causes of several key classes of security vulnerabilities and then considering alternative architectures that address principal causes. We implemented portions of our design in SystemVerilog and demonstrated the functionality and performance of implemented features through simulation. Instantiation efforts are limited to microarchitecture design and writing register-transfer level (RTL) descriptions of the processor; formal verification, synthesis, and fabrication steps are specifically excluded.Specifically, we implemented a single-core RISC-V processor with a modified Harvard architecture for improved isolation of memory resources between privilege levels. Our implementation also mitigates side-channel attacks by avoiding data-dependent timing and adding power obfuscating features. We found that these changes reduced IPC performance by 55%, due to the increased impact of memory latency while eliminating most security vulnerabilities due to cache timing, branch prediction, and power analysis.Approved for public release. Distribution is unlimited.Captain, United States Marine CorpsNCWD

Calhoun, Institutional Archive of the Naval Postgraduate School

Implementing Legba: Fine-Grained Memory Protection

Author: Sheffield
Sowell
Wilson
Publication venue: Washington University Open Scholarship
Publication date: 01/01/2007
Field of study

Fine-grained hardware protection could provide a powerful and effective means for isolating untrusted code. However, previous techniques for providing fine-grained protection in hardware have lead to poor performance. Legba has been proposed as a new caching architecture, designed to reduce the granularity of protection, without slowing down the processor. Unfortunately, the designers of Legba have not attempted an implementation. Instead, all of their analysis is based purely on simulations. We present an implementation of the Legba design on a MIPS Core Processor, along with an analysis of our observations and results

Washington University St. Louis: Open Scholarship

Coupling Latency-Insensitivity with Variable-Latency for Better Than Worst Case Design: A RISC Case Study

Author: Casu Mario Roberto
Colazzo Stefano
Mantovani P.
Publication venue: ACM
Publication date: 01/01/2011
Field of study

The gap between worst and typical case delays is bound to increase in nanometer scale technologies due to the spread in process manufacturing parameters. To still profit from scaling, designs should tolerate worst case delays seamlessly and with a minimum performance degradation with respect to the typical case. We present a simple RISC core which tolerates worst case extra latency using the Latency-Insensitive Design approach coupled to a Variable-Latency mechanism. Stalls caused by excessive delay, by data and control hazards and by late memory access are dealt with in a uniform way. Compared to a pure worst-case approach, our design method permits to increase the core clock frequency by 23% in a 45 nm CMOS technology, without area and power penalty

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

PORTO Publications Open Repository TOrino