State-of-the-art language generation models can degenerate when applied to
open-ended generation problems such as text completion, story generation, or
dialog modeling. This degeneration usually shows up in the form of incoherence,
lack of vocabulary diversity, and self-repetition or copying from the context.
In this paper, we postulate that ``human-like'' generations usually lie in a
narrow and nearly flat entropy band, and violation of these entropy bounds
correlates with degenerate behavior. Our experiments show that this stable
narrow entropy zone exists across models, tasks, and domains and confirm the
hypothesis that violations of this zone correlate with degeneration. We then
use this insight to propose an entropy-aware decoding algorithm that respects
these entropy bounds, resulting in less degenerate, more contextual, and
``human-like'' language generation in open-ended text generation settings.
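
As a rough illustration of the entropy-band idea above, the following minimal sketch computes the Shannon entropy of each next-token distribution and flags the decoding steps whose entropy falls outside a band. It is not the decoding algorithm proposed in this paper: the function names `entropy` and `band_violations`, the toy distributions, and the bounds `lower` and `upper` are all hypothetical choices made for the example.

```python
import math
from typing import List, Sequence


def entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of a single next-token distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def band_violations(
    step_probs: Sequence[Sequence[float]], lower: float, upper: float
) -> List[int]:
    """Return the indices of decoding steps whose entropy leaves [lower, upper].

    `lower` and `upper` are illustrative bounds, not values from the paper.
    """
    return [
        t
        for t, probs in enumerate(step_probs)
        if not lower <= entropy(probs) <= upper
    ]


# Toy next-token distributions over a four-word vocabulary (made up).
steps = [
    [0.25, 0.25, 0.25, 0.25],  # near-uniform: high entropy
    [0.97, 0.01, 0.01, 0.01],  # sharply peaked: low entropy
    [0.40, 0.30, 0.20, 0.10],  # moderate entropy
]
violations = band_violations(steps, lower=0.5, upper=1.3)
```

Under these assumed bounds, the first step is flagged for exceeding the upper bound and the second for falling below the lower one, while the third stays inside the band. A decoder could react to such flags by adjusting its sampling strategy, but how it should do so is left open here.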