Search CORE

2 research outputs found

Prebypass: Software Register File Bypassing for Reduced Interconnection Architectures

Author: Corporaal Henk
de Bruin Barry
Jordans Roel
Jääskeläinen P.
Vadivel Kanishkan
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2022
Field of study

Exposed Datapath Architectures (EDPAs) with aggressively pruned data-path connectivity, where not all function units in the design have connections to a centralized register file, are promising solutions for energy-efficient computation. A direct bypassing of data between function units without temporary copies to the register file is a prime optimization for programming such architectures. However, traditional compiler frameworks, such as LLVM, assume function-units connect to register-files and allocate all live variables in register-files. This leads to schedule inefficiencies in terms of instruction-level parallelism and reg-ister accesses in the EDPAs. To address these inefficiencies, we propose Prebypass; a new optimization pass for EDPA compiler backends. Experimental results on an EDPA class of architecture, Transport- Triggered Architecture, show that Prebypass improves the runtime, register reads, and register writes up to 16%, 26 %, and 37 % respectively, when the datapath is extremely pruned. Evaluation in a 28-nm FDSOI technology reveals that Prebypass improves the core-level Energy by 17.5 % over the current heuristic scheduler.acceptedVersionPeer reviewe

Pure OAI Repository

Trepo - Institutional Repository of Tampere University

Exposed Datapath optimizations for Loop Scheduling

Author: IJzerman Johannes
Jääskeläinen Pekka
Kultala Heikki
Lehtonen Lasse
Mäkitalo Markku
Takala Jarmo
Viitanen Timo
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2018
Field of study

Transport Triggered Architecture (TTA) processors allow unique low level compiler optimizations such as software bypassing and operand sharing. Previously, these optimizations have mostly been performed inside single basic blocks, leaving much of their potential unused. In this work, software bypassing and operand sharing are integrated with loop scheduling, allowing optimizations over loop iteration boundaries. This considerably further reduces register file accesses and immediate value transfers on tight loops – in some cases even eliminating all register file accesses from the loop body. In the benchmarked 12 small loops, compared to traditional VLIW-style processors, on average 63% of register file reads and 77% of register file writes could be eliminated. Compared to a compiler which performs these optimizations only inside a basic block, on average 58% of register file reads, 28% of register file writes could be eliminated. The additional register access reductions allow both direct energy savings from fewer register accesses and indirect energy savings by allowing the use of simpler register files with less read and write ports and a simpler interconnect network with less transport buses.acceptedVersionPeer reviewe

Trepo - Institutional Repository of Tampere University