Large language models (LLMs) can generate intermediate reasoning steps. To
elicit reliable reasoning, the common practice is to employ few-shot
chain-of-thought (CoT) prompting, where several in-context reasoning
demonstrations are prepended to the question. However, such CoT
examples are expensive to craft, especially for professional domains, and
their quality can vary widely across human annotators. Therefore, this work
investigates whether LLMs can teach themselves to reason without human-crafted
demonstrations. We propose SELF-EXPLAIN, which prompts LLMs to generate their
own CoT examples, inspired by the "encoding specificity" principle in human
memory retrieval. We find that using
self-explanations makes LLMs more confident, better calibrated, and less biased
when answering complex questions. Moreover, we find that prompting with
self-explanations can even significantly outperform prompting with human-crafted CoTs on
several complex question answering datasets.