2 research outputs found

    Rockmate: an Efficient, Fast, Automatic and Generic Tool for Re-materialization in PyTorch

    We propose Rockmate to control the memory requirements when training PyTorch DNN models. Rockmate is an automatic tool that starts from the model code and generates an equivalent model that uses a predefined amount of memory for activations, at the cost of a few re-computations. Rockmate automatically detects the structure of computational and data dependencies and rewrites the initial model as a sequence of complex blocks. We show that such a structure is widespread and can be found in many models in the literature (Transformer-based models, ResNet, RegNets, ...). This structure allows us to solve the problem quickly and efficiently, using an adaptation of Checkmate (general, but too slow on the whole model) at the level of individual blocks and an adaptation of Rotor (fast, but limited to sequential models) at the level of the sequence itself. We show through experiments on many models that Rockmate is as fast as Rotor and as efficient as Checkmate, and that in many cases it achieves significantly lower memory consumption for activations (by a factor of 2 to 5) for a rather negligible overhead (on the order of 10% to 20%). Rockmate is open source and available at https://github.com/topal-team/rockmate.
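
    Taking the abstract's description at face value (wrap an existing model, fix an activation-memory budget, accept some recomputation), a minimal usage sketch might look like the following. The `Rockmate(model, sample, budget)` constructor and import path are assumptions inferred from that description, not a confirmed API; consult the linked repository for the actual interface.

        import torch
        from torchvision.models import resnet50

        # Assumed import path and constructor signature -- check the Rockmate
        # repository for the real API before relying on this sketch.
        from rockmate import Rockmate

        model = resnet50().cuda()
        sample = torch.randn(8, 3, 224, 224, device="cuda")  # example input used to trace the model
        budget = 2 * 1024**3                                  # activation-memory budget, assumed to be in bytes

        # Rockmate rewrites the model into an equivalent one whose activation
        # memory stays under `budget`, re-computing some activations in backward.
        rk_model = Rockmate(model, sample, budget)

        loss = rk_model(sample).sum()
        loss.backward()  # discarded activations are re-computed here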

    H-Rockmate: Hierarchical Approach for Efficient Re-materialization of Large Neural Networks

    Training modern neural networks poses a significant memory challenge, as storing intermediate results during the forward and backward passes demands substantial memory resources. To address this issue while maintaining model accuracy, re-materialization techniques have been introduced to recompute selected intermediate results rather than storing them, thereby adhering to peak memory constraints. The main algorithmic problem is to compute a re-materialization schedule that minimizes the computational overhead within a given memory budget. Our H-Rockmate framework builds upon the existing Rockmate solution and overcomes its restriction to sequential block structures by proposing a hierarchical approach. The framework automatically decomposes the data-flow graph into a hierarchy of small-scale subgraphs and finds a re-materialization schedule for the whole graph by recursively solving optimization problems for each subgraph. H-Rockmate allows users to transform their PyTorch models into nn.Modules that execute forward and backward passes efficiently within the specified memory budget. The framework can handle neural networks with diverse data-flow graph structures, including U-Nets and encoder-decoder Transformers. H-Rockmate consistently outperforms existing re-materialization approaches in terms of both average training iteration time and peak memory trade-offs, demonstrating superior memory efficiency when training modern neural networks.
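
    The re-materialization trade-off this abstract optimizes (discard selected intermediates in the forward pass and recompute them during the backward pass) can be illustrated with PyTorch's built-in activation checkpointing. This is not H-Rockmate's interface, only a minimal sketch of the underlying memory-versus-compute trade-off that H-Rockmate schedules automatically over an arbitrary data-flow graph.

        import torch
        import torch.nn as nn
        from torch.utils.checkpoint import checkpoint

        # A block whose internal activations we choose not to keep.
        block = nn.Sequential(nn.Linear(1024, 4096), nn.ReLU(), nn.Linear(4096, 1024))
        x = torch.randn(32, 1024, requires_grad=True)

        # Standard forward: every intermediate is stored for the backward pass.
        y_stored = block(x)

        # Re-materialized forward: intermediates inside `block` are dropped and
        # recomputed during backward, lowering peak memory at the cost of extra compute.
        y_ckpt = checkpoint(block, x, use_reentrant=False)

        y_ckpt.sum().backward()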