Recent works in open-domain question answering (QA) have explored generating
context passages from large language models (LLMs), replacing the traditional
retrieval step in the QA pipeline. However, it is not well understood why
generated passages can be more effective than retrieved ones. This study
revisits the conventional formulation of QA and introduces the concept of
knowledge corpus error. This error arises when the knowledge corpus used for
retrieval is only a subset of the entire string space, potentially excluding
more helpful passages that exist outside the corpus. LLMs may mitigate this
shortcoming by generating passages in a larger space. We design an experiment in
which LLMs paraphrase human-annotated gold contexts, allowing us to observe
knowledge corpus error empirically. Our results across three QA benchmarks
show a performance increase of 10% - 13% when paraphrased passages are used,
signaling the existence of knowledge corpus error. Our code is
available at https://github.com/xfactlab/emnlp2023-knowledge-corpus-error
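
As a minimal illustration of the paraphrasing experiment described above, the sketch below shows one way such a setup could be wired together. It is not the authors' exact pipeline: the prompt wording, the hypothetical `call_llm` helper, and the `reader` interface are all assumptions introduced here for illustration.

```python
# Sketch of the gold-context paraphrasing experiment (assumed setup, not the
# authors' released code). `call_llm` is a hypothetical stand-in for any LLM
# client; the prompt text and whether the question is included are assumptions.

PARAPHRASE_PROMPT = (
    "Paraphrase the following passage so that it still supports answering "
    "the question, keeping all relevant facts.\n\n"
    "Question: {question}\nPassage: {passage}\n\nParaphrased passage:"
)

def call_llm(prompt: str) -> str:
    """Hypothetical LLM call; replace with a real API or local model."""
    raise NotImplementedError("plug in an LLM client here")

def paraphrase_gold_context(question: str, gold_passage: str) -> str:
    """Ask the LLM to rewrite the human-annotated gold passage."""
    return call_llm(PARAPHRASE_PROMPT.format(question=question, passage=gold_passage))

def compare_reader_accuracy(examples: list[dict], reader) -> dict:
    """Run a QA reader on gold vs. paraphrased context and report accuracy.

    `examples` is a list of dicts with 'question', 'gold_passage', 'answer';
    `reader` maps (question, passage) -> predicted answer string.
    """
    gold_correct = para_correct = 0
    for ex in examples:
        para_passage = paraphrase_gold_context(ex["question"], ex["gold_passage"])
        gold_correct += reader(ex["question"], ex["gold_passage"]) == ex["answer"]
        para_correct += reader(ex["question"], para_passage) == ex["answer"]
    n = max(len(examples), 1)
    return {"gold_acc": gold_correct / n, "paraphrased_acc": para_correct / n}
```

Under this assumed setup, a gap between `paraphrased_acc` and `gold_acc` in favor of the paraphrased passages would be the empirical signal of knowledge corpus error that the abstract refers to.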