Recently, Large Language Models (LLMs) have shown impressive results in code
generation. However, existing decoding strategies are designed for Natural
Language (NL) generation, overlooking the differences between NL and
programming languages (PL). Due to this oversight, a better decoding strategy
for code generation remains an open question. In this paper, we conduct the
first systematic study to explore a decoding strategy specialized in code
generation. With an analysis of loss distributions of code tokens, we find that
code tokens can be divided into two categories: challenging tokens that are
difficult to predict and confident tokens that can be easily inferred. Among
them, the challenging tokens mainly appear at the beginning of a code block.
Inspired by the above findings, we propose a simple yet effective method:
Adaptive Temperature (AdapT) sampling, which dynamically adjusts the
temperature coefficient when decoding different tokens. We apply a larger
temperature when sampling for challenging tokens, allowing LLMs to explore
diverse choices. We employ a smaller temperature for confident tokens avoiding
the influence of tail randomness noises. We apply AdapT sampling to LLMs with
different sizes and conduct evaluations on two popular datasets. Results show
that AdapT sampling significantly outperforms state-of-the-art decoding
strategy