Tailor: Altering Skip Connections for Resource-Efficient Inference

Denolf, Kristof; Duarte, Javier Mauricio; Kastner, Ryan; Khodamoradi, Alireza; Koushanfar, Farinaz; Loncar, Vladimir; Marcano, Gabriel; Meza, Andres; Sheybani, Nojan; Weng, Olivia

Authors: Kristof Denolf, Javier Mauricio Duarte, Ryan Kastner, Alireza Khodamoradi, Farinaz Koushanfar, Vladimir Loncar, Gabriel Marcano, Andres Meza, Nojan Sheybani, Olivia Weng
Publication date: 15 September 2023

Abstract

Deep neural networks use skip connections to improve training convergence. However, these skip connections are costly in hardware, requiring extra buffers and increasing on- and off-chip memory utilization and bandwidth requirements. In this paper, we show that skip connections can be optimized for hardware when tackled with a hardware-software codesign approach. We argue that while a network's skip connections are needed for the network to learn, they can later be removed or shortened to provide a more hardware-efficient implementation with minimal to no accuracy loss. We introduce Tailor, a codesign tool whose hardware-aware training algorithm gradually removes or shortens a fully trained network's skip connections to lower their hardware cost. Tailor improves resource utilization by up to 34% for BRAMs, 13% for FFs, and 16% for LUTs for on-chip, dataflow-style architectures. Tailor increases performance by 30% and reduces memory bandwidth by 45% for a 2D processing element array architecture.
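
The idea of gradually removing a skip connection can be illustrated with a minimal sketch. The abstract does not specify how the removal is scheduled, so everything below is an assumption made for illustration: the names `skip_scale` and `residual_block` are hypothetical, the linear decay is one possible schedule, and the `double` function merely stands in for a block's real layers. It is not Tailor's actual algorithm.

```python
def skip_scale(epoch, total_epochs):
    """Hypothetical linear schedule: 1.0 at epoch 0, 0.0 at the final epoch."""
    if total_epochs <= 1:
        return 0.0
    return max(0.0, 1.0 - epoch / (total_epochs - 1))


def residual_block(x, layer, alpha):
    """Compute layer(x) + alpha * x elementwise.

    alpha = 1.0 is an ordinary skip connection; alpha = 0.0 means the skip
    branch contributes nothing and its buffer is no longer needed.
    """
    y = layer(x)
    return [yi + alpha * xi for yi, xi in zip(y, x)]


def double(x):
    # Stand-in for the block's real layers (assumed for illustration only).
    return [2.0 * xi for xi in x]


if __name__ == "__main__":
    x = [1.0, -2.0, 3.0]
    total_epochs = 5
    for epoch in range(total_epochs):
        alpha = skip_scale(epoch, total_epochs)
        # In a real setting the network would be fine-tuned at each step
        # so that it adapts to the weaker skip branch.
        print(epoch, alpha, residual_block(x, double, alpha))
```

Once the scale reaches zero, the block computes only `layer(x)`, which is the property that lets a hardware implementation drop the buffer holding the skipped activations.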

Available Versions

arXiv.org e-Print Archive

oai:arXiv.org:2301.07247

Last time updated on 02/02/2023