Prior work has demonstrated large language models' (LLMs) potential to
discern statistical tendencies within their pre-training corpora. Despite that,
many examinations of LLMs' knowledge capacity focus on knowledge explicitly
appearing in the training data or implicitly inferable from similar contexts.
How well an LLM captures the corpus-level statistical trends of concepts for
reasoning, especially long-tail ones, is still underexplored. In this study, we
introduce a novel few-shot question-answering task (CPopQA) that examines LLMs'
statistical ranking abilities for long-tail cultural concepts (e.g., holidays),
with a specific focus on these concepts' popularity in the United States and
the United Kingdom, respectively. We curate a dataset containing 459 holidays
across 58 countries, generating a total of 6,000 QA testing pairs. Experiments
on four strong LLMs show that large models are capable of ranking long-tail
cultural concepts regarding their statistical tendency. Notably, GPT-3.5
displayed superior performance and exhibited its potential to identify
geo-cultural proximity across continents