Large Language Models (LLMs) have shown impressive abilities in various
tasks. However, fundamentally improving them depends on high-quality datasets
or computationally expensive fine-tuning. In contrast, humans can readily
improve themselves through self-thinking and memory, without external resources. In
this paper, we propose a framework, MoT, to let the LLM self-improve through
Memory-of-Thought, without annotated datasets or parameter updates.
Specifically, MoT consists of two stages: (1) before the test stage, the LLM
pre-thinks on an unlabeled dataset and saves its high-confidence thoughts as
external memory; (2) during the test stage, given a test question, the LLM
recalls relevant memory to help itself reason and answer it. Experimental
results show that MoT can help ChatGPT significantly improve its abilities in
arithmetic reasoning, commonsense reasoning, factual reasoning, and natural
language inference. Further analyses show that each component contributes
critically to the improvements and MoT can lead to consistent improvements
across various CoT methods and LLMs.
Comment: Accepted to appear at EMNLP 202
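
The two stages described above can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not specify how confidence is measured or how memory is retrieved, so the sketch assumes majority-vote agreement over sampled reasoning paths as the confidence signal and simple word overlap as the retrieval score. All names (`pre_think`, `recall`, `build_prompt`, `toy_sampler`) and the thresholds are hypothetical, and the sampler is a toy stand-in for a real LLM call.

```python
from collections import Counter
from typing import Callable, Dict, List, Tuple

# Hypothetical interface: given a prompt and a sample count, return
# (thought, answer) pairs. A real system would call an LLM here.
Sampler = Callable[[str, int], List[Tuple[str, str]]]


def pre_think(questions: List[str], sample: Sampler,
              n_samples: int = 5, min_agreement: float = 0.8) -> List[Dict[str, str]]:
    """Stage 1: answer unlabeled questions and keep only confident thoughts.

    Assumption: confidence is the fraction of sampled paths that agree
    with the majority answer.
    """
    memory = []
    for question in questions:
        paths = sample(question, n_samples)
        if not paths:
            continue
        votes = Counter(answer for _, answer in paths)
        answer, count = votes.most_common(1)[0]
        if count / len(paths) >= min_agreement:
            # Store one thought that led to the majority answer.
            thought = next(t for t, a in paths if a == answer)
            memory.append({"question": question, "thought": thought, "answer": answer})
    return memory


def recall(memory: List[Dict[str, str]], question: str, k: int = 2) -> List[Dict[str, str]]:
    """Stage 2a: retrieve the k most similar entries (assumed word-overlap score)."""
    query_words = set(question.lower().split())

    def overlap(entry: Dict[str, str]) -> int:
        return len(query_words & set(entry["question"].lower().split()))

    return sorted(memory, key=overlap, reverse=True)[:k]


def build_prompt(memory: List[Dict[str, str]], question: str, k: int = 2) -> str:
    """Stage 2b: prepend recalled thoughts as demonstrations for the test question."""
    parts = [
        f"Q: {m['question']}\nA: {m['thought']} The answer is {m['answer']}."
        for m in recall(memory, question, k)
    ]
    parts.append(f"Q: {question}\nA:")
    return "\n\n".join(parts)


def toy_sampler(prompt: str, n: int) -> List[Tuple[str, str]]:
    """Toy stand-in for an LLM: consistent on one question, inconsistent otherwise."""
    if "2 + 3" in prompt:
        return [("2 plus 3 is 5.", "5")] * n
    return [(f"guess {i}", str(i)) for i in range(n)]


if __name__ == "__main__":
    mem = pre_think(["What is 2 + 3?", "What is the meaning of life?"], toy_sampler)
    print(build_prompt(mem, "What is 2 + 4?"))
```

With the toy sampler, only the first question passes the agreement filter, so the memory holds a single entry that is then recalled as a demonstration for the new question. Swapping in a different confidence measure or retriever changes only `pre_think` and `recall`.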