Large language models (LLMs) have recently demonstrated strong text
understanding and generation capabilities. However, even strong LLMs may
still learn incorrect knowledge from the training corpus, as well as
knowledge that becomes outdated over time. Directly fine-tuning a model again on data
containing new knowledge may fail to update that knowledge because of the
conflict between old and new knowledge. In this paper, we propose a new
fine-tuning paradigm called F-Learning (Forgetting before Learning), which
uses parametric arithmetic to first forget old knowledge and then
learn new knowledge. Experimental results on two publicly available
datasets demonstrate that F-Learning noticeably improves the
knowledge updating performance of both full fine-tuning and LoRA fine-tuning.
Moreover, we find that forgetting old knowledge by subtracting
the parameters of LoRA can achieve an effect similar to subtracting the
parameters of full fine-tuning, and sometimes even surpass it significantly.

Comment: 8 pages, 2 figures, 2 tables
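
The forgetting step described in the abstract can be illustrated with a small sketch of parameter arithmetic. This is an assumption-laden toy, not the paper's implementation: the update rule (base minus a scaled difference between an old-knowledge fine-tuned model and the base), the scaling factor `lam`, and the helper names `knowledge_delta` and `forget` are all hypothetical, and parameters are represented as plain lists of floats rather than real model tensors.

```python
# Toy sketch of "forgetting" via parameter arithmetic (hypothetical helpers).
# Parameters are dicts mapping a name to a list of floats.

def knowledge_delta(base, finetuned_on_old):
    """Difference between a model fine-tuned on old knowledge and the base model.

    Assumption: this difference is treated as the parameters encoding old knowledge.
    """
    return {
        name: [f - b for f, b in zip(finetuned_on_old[name], base[name])]
        for name in base
    }

def forget(base, delta_old, lam=1.0):
    """Subtract the scaled old-knowledge delta from the base parameters.

    `lam` is an assumed hyperparameter controlling how strongly to forget.
    """
    return {
        name: [b - lam * d for b, d in zip(base[name], delta_old[name])]
        for name in base
    }

base = {"w": [1.0, 2.0, 3.0]}
finetuned_on_old = {"w": [1.5, 2.0, 2.0]}

delta_old = knowledge_delta(base, finetuned_on_old)
forgotten = forget(base, delta_old, lam=0.5)
# The learning step would then fine-tune `forgotten` on the new knowledge
# (not shown here).
```

With LoRA, the same idea would apply to the low-rank adapter's contribution instead of a full-parameter difference, which is the variant the abstract reports as comparably effective.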