
    Reversible designs for extreme memory cost reduction of CNN training

    Training Convolutional Neural Networks (CNNs) is a resource-intensive task that requires specialized hardware for efficient computation. One of the most limiting bottlenecks of CNN training is the memory cost associated with storing the activation values of hidden layers. These values are needed for the computation of the weights' gradient during the backward pass of the backpropagation algorithm. Recently, reversible architectures have been proposed to reduce the memory cost of training large CNNs by reconstructing the input activation values of hidden layers from their outputs during the backward pass, circumventing the need to accumulate these activations in memory during the forward pass. In this paper, we push this idea to the extreme and analyze reversible network designs yielding minimal training memory footprint. We investigate the propagation of numerical errors in long chains of invertible operations and analyze their effect on training. We introduce the notion of pixel-wise memory cost to characterize the memory footprint of model training, and propose a new model architecture able to efficiently train arbitrarily deep neural networks with a minimum memory cost of 352 bytes per input pixel. This new kind of architecture enables training large neural networks on very limited memory, opening the door for neural network training on embedded devices or non-specialized hardware. For instance, we demonstrate training of our model to 93.3% accuracy on the CIFAR10 dataset within 67 minutes on a low-end Nvidia GTX750 GPU with only 1 GB of memory.
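
    To make the reconstruction idea concrete, the following is a minimal sketch of an additive-coupling reversible block (in the spirit of RevNet-style designs, not necessarily the architecture proposed in the paper); the residual functions F and G, and the NumPy setup, are illustrative assumptions.

    # Minimal sketch of an additive-coupling reversible block. F and G are
    # placeholder residual functions standing in for small convolutional branches.
    import numpy as np

    def F(x):
        # Placeholder residual branch.
        return np.tanh(x)

    def G(x):
        # Second placeholder residual branch.
        return np.tanh(2.0 * x)

    def forward(x1, x2):
        # Forward pass: the inputs need not be kept in memory,
        # since they can be recovered from the outputs later.
        y1 = x1 + F(x2)
        y2 = x2 + G(y1)
        return y1, y2

    def inverse(y1, y2):
        # Backward-pass reconstruction: recover the input activations
        # from the outputs, avoiding activation storage in the forward pass.
        x2 = y2 - G(y1)
        x1 = y1 - F(x2)
        return x1, x2

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        x1, x2 = rng.standard_normal(8), rng.standard_normal(8)
        y1, y2 = forward(x1, x2)
        r1, r2 = inverse(y1, y2)
        # Reconstruction is exact up to floating-point error; the accumulation
        # of such errors over long chains of invertible operations is the
        # numerical issue the abstract mentions analyzing.
        print(np.max(np.abs(r1 - x1)), np.max(np.abs(r2 - x2)))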