Search CORE

40,768 research outputs found

Vector processing-aware advanced clock-gating techniques for low-power fused multiply-add

Author: Cristal Kestelman Adrián
Palomar Pérez Óscar
Ratkovic Ivan
Stanic Milan
Unsal Osman Sabri
Valero Cortés Mateo
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2018

The need for power efficiency is driving a rethink of design decisions in processor architectures. While vector processors succeeded in the high-performance market in the past, they need retailoring for the mobile market that they are now entering. The floating-point (FP) fused multiply-add (FMA) unit, being a functional unit with high power consumption, deserves special attention. Although clock gating is a well-known method to reduce switching power in synchronous designs, there are unexplored opportunities for its application to vector processors, especially in active operating mode. In this research, we comprehensively identify, propose, and evaluate the most suitable clock-gating techniques for vector FMA units (VFUs). These techniques ensure power savings without jeopardizing the timing. We evaluate the proposed techniques using both synthetic and “real-world” application-based benchmarking. Using vector masking and vector multilane-aware clock gating, we report power reductions of up to 52%, assuming an active VFU operating at peak performance. Among other findings, we observe that vector instruction-based clock-gating techniques achieve power savings for all vector FP instructions. Finally, when evaluating all techniques together using “real-world” benchmarking, the power reductions are up to 80%. Additionally, in accordance with processor design trends, we perform this research in a fully parameterizable and automated fashion.

The research leading to these results has received funding from the RoMoL ERC Advanced Grant GA 321253 and is supported in part by the European Union (FEDER funds) under contract TTIN2015-65316-P. The work of I. Ratkovic was supported by a FPU research grant from the Spanish MECD. Peer reviewed. Postprint (author's final draft).

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC
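
The idea of vector masking and multilane-aware clock gating mentioned in the abstract above can be illustrated with a toy activity model. This is only a sketch under stated assumptions, not the authors' design: it assumes a vector is processed `lanes` elements per cycle, that a lane's clock can be disabled whenever its mask bit is off or the lane has no element to process in the final cycle, and it counts gated lane-cycle slots rather than estimating power. The names `lane_clock_enables` and `gated_fraction` are hypothetical.

```python
def lane_clock_enables(mask, lanes):
    """Yield one tuple of per-lane clock-enable flags for each cycle.

    mask  -- sequence of booleans, one per vector element (True = active)
    lanes -- number of parallel lanes (assumed: one element per lane per cycle)
    """
    for start in range(0, len(mask), lanes):
        group = tuple(bool(m) for m in mask[start:start + lanes])
        # Lanes without an element in the last cycle are treated as gated.
        yield group + (False,) * (lanes - len(group))


def gated_fraction(mask, lanes):
    """Fraction of lane-cycle slots whose clock could be gated in this model."""
    enables = [e for cycle in lane_clock_enables(mask, lanes) for e in cycle]
    if not enables:
        return 0.0
    return enables.count(False) / len(enables)


if __name__ == "__main__":
    mask = [True, False, True, True, False, False]
    print(gated_fraction(mask, lanes=4))
```

The fraction reported here is a count of idle slots only; the power figures quoted in the abstract come from the authors' own evaluation and cannot be derived from this model.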

Architecture and Design of Medical Processor Units for Medical Networks

Author: Ahamed Syed V.
Rahman Syed Shawon M.
Publication venue: 'Academy and Industry Research Collaboration Center (AIRCC)'
Publication date: 13/04/2011

This paper introduces analogical and deductive methodologies for the design of medical processor units (MPUs). From the study of the evolution of numerous earlier processors, we derive the basis for the architecture of MPUs. These specialized processors perform unique medical functions encoded as medical operational codes (mopcs). From a pragmatic perspective, MPUs function very much like CPUs. Both processors have unique operation codes that command the hardware to perform a distinct chain of subprocesses upon operands and generate a specific result unique to the opcode and the operand(s). In medical environments, the MPU decodes the mopcs, executes a series of medical subprocesses, and sends out secondary commands to the medical machine. Whereas operands in a typical computer system are numerical and logical entities, the operands in a medical machine are objects such as patients, blood samples, tissues, operating rooms, medical staff, medical bills, patient payments, etc. We follow the functional overlap between the two processes and evolve the design of medical computer systems and networks. Comment: 17 pages

arXiv.org e-Print Archive

CiteSeerX

Crossref

Measuring Improvement when Using HUB Formats to Implement Floating-Point Systems under Round-to-Nearest

Author: Hormigo-Aguilar Javier
Villalba-Moreno Julio
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2016

This paper analyzes the benefits of using HUB formats to implement floating-point arithmetic under round-to-nearest mode from a quantitative point of view. Using HUB formats to represent numbers allows the removal of the rounding logic of arithmetic units, including sticky-bit computation. This is shown for floating-point adders, multipliers, and converters. Experimental analysis demonstrates that HUB formats and the corresponding arithmetic units maintain the same accuracy as conventional ones. Moreover, the implementation of these units, based on basic architectures, shows that HUB formats simultaneously improve area, speed, and power consumption. Specifically, based on data obtained from synthesis, a HUB single-precision adder is about 14% faster and consumes 38% less area and 26% less power than the conventional adder. Similarly, a HUB single-precision multiplier is 17% faster, uses 22% less area, and consumes slightly less power than the conventional multiplier. At the same speed, the adder and multiplier achieve area and power reductions of up to 50% and 40%, respectively. Funded by MEC under TIN2013-42253-P.

Repositorio Institucional Universidad de Málaga
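
Why a HUB format can drop the rounding logic, as the abstract above claims, can be illustrated with integer arithmetic. The sketch below rests on an assumption not stated in this listing: a HUB number is modeled as its stored bits followed by an implicit least-significant bit fixed to 1, so truncating the discarded bits already lands on the nearest representable value. The function names are hypothetical, and the code models unsigned fixed-point values only, not a full floating-point unit.

```python
def hub_truncate(value, drop):
    """Truncate `value` by `drop` low bits and return (stored, represented).

    The represented value is the stored bits followed by an implicit 1
    in the half-unit position (assumed HUB model). No addition, carry
    propagation, or sticky bit is needed.
    """
    if drop < 1:
        raise ValueError("drop must be at least 1")
    stored = value >> drop
    represented = (stored << drop) | (1 << (drop - 1))
    return stored, represented


def conventional_round(value, drop):
    """Conventional rounding to nearest (ties rounded up), for comparison.

    This needs an adder: half a unit is added before truncation.
    """
    if drop < 1:
        raise ValueError("drop must be at least 1")
    stored = (value + (1 << (drop - 1))) >> drop
    return stored, stored << drop


if __name__ == "__main__":
    v = 0b10110111  # 183
    print(hub_truncate(v, 3))        # stored bits and represented value
    print(conventional_round(v, 3))
```

In this model both schemes stay within half a unit of the retained precision, but only the conventional one needs an addition to get there.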

Configurable 3D-integrated focal-plane sensor-processor array architecture

Author: Földesy Péter
Rekeczky Csaba
Zarándy Ákos
Publication venue: Wiley-Blackwell
Publication date: 01/01/2008

A mixed-signal Cellular Visual Microprocessor architecture with digital processors is described, and an ASIC implementation is demonstrated. The architecture is composed of a regular sensor readout circuit array, prepared for 3D face-to-face integration, and one or several cascaded arrays of mostly identical (SIMD) processing elements. The individual array elements are derived from the same general HDL description and can differ in size, aspect ratio, and computing resources.

SZTAKI Publication Repository

Repository of the Academy's Library

Self-testing and repairing computer Patent

Author: Avizienis A. A.
Publication date: 23/06/1970

Self-testing and repairing computer comprising a control and diagnostic unit and rollback points for error correction.

NASA Technical Reports Server

DFT and BIST of a multichip module for high-energy physics experiments

Author: Benso Alfredo
Chiusano Silvia Anna
Prinetto Paolo Ernesto
Publication venue: IEEE
Publication date: 01/01/2002

Engineers at Politecnico di Torino designed a multichip module (MCM) for high-energy physics experiments conducted on the Large Hadron Collider. An array of these MCMs handles multichannel data acquisition and signal processing. Testing the MCM from board to die level required a combination of DFT strategies.

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

PORTO Publications Open Repository TOrino

Adder Based Residue to Binary Number Converters for (2^n - 1; 2^n; 2^n + 1)

Author: Aboulhamid M.
Shen H.
Song X.
Wang Y.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2002

Based on an algorithm derived from the new Chinese remainder theorem I, we present three new residue-to-binary converters for the residue number system (2^n - 1, 2^n, 2^n + 1), designed using 2n-bit or n-bit adders, with improvements in speed, area, or dynamic range compared with various previous converters. The 2n-bit adder-based converter is faster and requires about half the hardware required by previous methods. For n-bit adder-based implementations, one new converter is twice as fast as the previous method using a similar amount of hardware, whereas another new converter achieves improvement in either speed, area, or dynamic range compared with previous converters. Copyright © 2002 IEEE.

Yuke Wang, Xiaoyu Song, Mostapha Aboulhamid and Hong Shen

Adelaide Research & Scholarship
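
As a point of reference for what a residue-to-binary converter computes, the following sketch recovers an integer from its residues with respect to the moduli (2^n - 1, 2^n, 2^n + 1). It uses the classical Chinese remainder theorem in software, not the "new Chinese remainder theorem I" or the adder-based hardware structures of the paper above, and the function names are hypothetical. It assumes n >= 2 and Python 3.8 or later (for `pow` with a negative exponent and a modulus).

```python
def moduli(n):
    """Return the moduli set (2^n - 1, 2^n, 2^n + 1); pairwise coprime for n >= 2."""
    if n < 2:
        raise ValueError("n must be at least 2")
    return (1 << n) - 1, 1 << n, (1 << n) + 1


def to_residues(x, n):
    """Forward conversion: binary integer to its three residues."""
    return tuple(x % m for m in moduli(n))


def rns_to_binary(residues, n):
    """Reverse conversion via the classical CRT.

    Returns the unique X in [0, M) with X mod m_i == r_i, where M is the
    product of the three moduli (the dynamic range).
    """
    ms = moduli(n)
    big_m = ms[0] * ms[1] * ms[2]
    total = 0
    for r, m in zip(residues, ms):
        m_hat = big_m // m                     # product of the other two moduli
        total += r * m_hat * pow(m_hat, -1, m) # modular inverse of m_hat mod m
    return total % big_m


if __name__ == "__main__":
    n = 3                       # moduli 7, 8, 9; dynamic range 504
    r = to_residues(100, n)
    print(r, rns_to_binary(r, n))
```

The general-purpose multiplications and the final reduction modulo M are what dedicated converters for this moduli set aim to avoid in hardware.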