The Curious Case of Neural Text Degeneration
Despite considerable advances in neural language modeling, it remains an open question what the best decoding strategy is for text generation from a language model (e.g. to generate a story). The counter-intuitive empirical observation is that even though the use of likelihood as training objective leads to high quality models for a broad range of language understanding tasks, maximization-based decoding methods such as beam search lead to degeneration — output text that is bland, incoherent, or gets stuck in repetitive loops.
To address this, we propose Nucleus Sampling, a simple but effective method to draw considerably higher-quality text out of neural language models than previous decoding strategies. Our approach avoids text degeneration by truncating the unreliable tail of the probability distribution, sampling from the dynamic nucleus of tokens containing the vast majority of the probability mass.
To properly examine current maximization-based and stochastic decoding methods, we compare generations from each of these methods to the distribution of human text along several axes such as likelihood, diversity, and repetition. Our results show that (1) maximization is an inappropriate decoding objective for open-ended text generation, (2) the probability distributions of the best current language models have an unreliable tail which needs to be truncated during generation, and (3) Nucleus Sampling is currently the best available decoding strategy for generating long-form text that is both high-quality (as measured by human evaluation) and as diverse as human-written text.
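The truncation rule is easy to state in code. Below is a minimal PyTorch sketch of nucleus (top-p) sampling written for this summary, not the authors' released implementation; the threshold p = 0.95 is an illustrative value.

```python
import torch

def nucleus_sample(logits: torch.Tensor, p: float = 0.95) -> int:
    """Sample the next token from the smallest set of tokens (the 'nucleus')
    whose cumulative probability mass exceeds p."""
    probs = torch.softmax(logits, dim=-1)
    sorted_probs, sorted_idx = torch.sort(probs, descending=True)
    cumulative = torch.cumsum(sorted_probs, dim=-1)
    # A token is in the nucleus if the mass *before* it is still below p;
    # this always keeps at least the single most likely token.
    in_nucleus = (cumulative - sorted_probs) < p
    nucleus_probs = sorted_probs[in_nucleus]
    nucleus_probs = nucleus_probs / nucleus_probs.sum()  # renormalize the truncated head
    choice = torch.multinomial(nucleus_probs, num_samples=1)
    return int(sorted_idx[in_nucleus][choice].item())
```

Because the nucleus is recomputed at every step, the number of candidate tokens adapts to the model's confidence, unlike a fixed top-k cutoff.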
TextGAIL: Generative Adversarial Imitation Learning for Text Generation
Generative Adversarial Networks (GANs) for text generation have recently received much criticism, as they perform worse than their maximum likelihood estimation (MLE) counterparts. We suspect that previous text GANs' inferior performance is due to the lack of a reliable guiding signal in their discriminators. To address this problem, we propose a generative adversarial imitation learning framework for text generation that uses large pre-trained language models to provide more reliable reward guidance. Our approach uses a contrastive discriminator and proximal policy optimization (PPO) to stabilize and improve text generation performance. For evaluation, we conduct experiments on a diverse set of unconditional and conditional text generation tasks. Experimental results show that TextGAIL achieves better performance in terms of both quality and diversity than the MLE baseline. We also validate our intuition that TextGAIL's discriminator provides reasonable rewards through an additional task.
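The abstract does not spell out how the contrastive discriminator produces rewards; the sketch below is one plausible reading, assuming a RoBERTa-style classifier head ("roberta-base" and num_labels=1 are illustrative choices) that scores a generated continuation against a paired human reference, with a pairwise softmax serving as the reward.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Illustrative backbone and head: the abstract does not name these exact choices.
tokenizer = AutoTokenizer.from_pretrained("roberta-base")
discriminator = AutoModelForSequenceClassification.from_pretrained(
    "roberta-base", num_labels=1)

def contrastive_rewards(real_texts, generated_texts):
    """Score each generated text *relative to* a paired human reference:
    a softmax over the (real, generated) pair yields a bounded reward in
    (0, 1), a steadier PPO signal than an absolute real/fake probability."""
    rewards = []
    for real, generated in zip(real_texts, generated_texts):
        batch = tokenizer([real, generated], return_tensors="pt",
                          padding=True, truncation=True)
        with torch.no_grad():
            logits = discriminator(**batch).logits.squeeze(-1)  # shape (2,)
        pair_probs = torch.softmax(logits, dim=-1)
        rewards.append(pair_probs[1].item())  # mass assigned to the sample
    return rewards
```

A PPO loop would then update the generator to maximize these rewards while staying close to its pre-trained policy.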
Fidelity-Enriched Contrastive Search: Reconciling the Faithfulness-Diversity Trade-Off in Text Generation
In this paper, we address the hallucination problem commonly found in natural language generation tasks. Language models often generate fluent and convincing content but can lack consistency with the provided source, resulting in potential inaccuracies. We propose a new decoding method called Fidelity-Enriched Contrastive Search (FECS), which augments the contrastive search framework with context-aware regularization terms. FECS promotes tokens that are semantically similar to the provided source while penalizing repetitiveness in the generated text. We demonstrate its effectiveness across two tasks prone to hallucination: abstractive summarization and dialogue generation. Results show that FECS consistently enhances faithfulness across various language model sizes while maintaining output diversity comparable to well-performing decoding algorithms.
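The abstract describes FECS only at a high level; the following sketch shows one way such a scoring rule could look, assuming contrastive-search-style re-ranking over k candidate tokens with cosine similarity over hidden states. The weights alpha and beta, the function name, and all tensor shapes are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def fecs_score(cand_probs: torch.Tensor,     # (k,)   model probabilities of k candidates
               cand_hidden: torch.Tensor,    # (k, d) hidden states of candidate tokens
               context_hidden: torch.Tensor, # (t, d) states of tokens generated so far
               source_hidden: torch.Tensor,  # (s, d) states of source-document tokens
               alpha: float = 0.6,
               beta: float = 0.3) -> torch.Tensor:
    """Score k candidate tokens: reward model confidence, penalize similarity
    to already-generated text (degeneration), and reward similarity to the
    source (faithfulness). Returns a (k,) tensor; decode its argmax."""
    # Maximum cosine similarity of each candidate to the generation history.
    degeneration = F.cosine_similarity(
        cand_hidden.unsqueeze(1), context_hidden.unsqueeze(0), dim=-1
    ).max(dim=1).values
    # Maximum cosine similarity of each candidate to the source tokens.
    faithfulness = F.cosine_similarity(
        cand_hidden.unsqueeze(1), source_hidden.unsqueeze(0), dim=-1
    ).max(dim=1).values
    return (1 - alpha) * cand_probs - alpha * degeneration + beta * faithfulness
```

The degeneration penalty is the standard contrastive search term; the faithfulness bonus is the fidelity-enriching addition the title refers to.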
An Empirical Study On Contrastive Search And Contrastive Decoding For Open-ended Text Generation
In this study, we empirically compare two recently proposed decoding methods for open-ended text generation: Contrastive Search (CS) and Contrastive Decoding (CD). The automatic evaluation results suggest that, while CS performs worse than CD on the MAUVE metric, it substantially surpasses CD on the diversity and coherence metrics. More notably, extensive human evaluations across three different domains demonstrate that human annotators universally favor CS over CD by substantial margins. This contradiction between MAUVE and the human evaluations reveals that MAUVE does not accurately reflect human preferences. Therefore, we call upon the research community to develop better evaluation metrics for open-ended text generation. To ensure the reproducibility of our work, we have open-sourced all our code, evaluation results, and human annotations at https://github.com/yxuansu/Contrastive_Search_versus_Contrastive_Decoding.
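For reference, the diversity metric in this literature is commonly computed from distinct n-gram ratios; a small self-contained version (a common formulation, not necessarily the exact one this report uses) is:

```python
def distinct_ngram_ratio(tokens, n):
    """Fraction of n-grams in `tokens` that are unique."""
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    return len(set(ngrams)) / max(len(ngrams), 1)

def diversity(tokens):
    """Diversity as the product of distinct n-gram ratios for n = 2, 3, 4;
    higher means less repetitive text."""
    score = 1.0
    for n in (2, 3, 4):
        score *= distinct_ngram_ratio(tokens, n)
    return score

# A repetitive string scores far lower than a varied one of the same length.
print(diversity("the cat sat on the mat".split()))  # 1.0
print(diversity("the cat the cat the cat".split()))  # ~0.13
```

Disagreement between such surface metrics, MAUVE, and human judgments is precisely the gap the paper highlights.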
Phylogeny, Psychology, and the Vicissitudes of Human Development: The Anxiety of Atavism
This cross-disciplinary dissertation provides a missing intellectual history of an ostensibly dead idea. Once widely held and no less elegant for its obsolescence, the principle of biogenetic recapitulation is best remembered by its defining mantra, “ontogeny recapitulates phylogeny.” Among psychologists and sociologists as well as embryologists, the notion that the development of any individual organism repeats in compressed, miniaturized form the entire history of its species enjoyed broad (if not uncontested) acceptance through the early twentieth century. The author reexamines the origins of this theory in the work of Charles Darwin and Ernst Haeckel, and traces its influence in psychology from early psychoanalytic theory to late twentieth-century evolutionary neuroscience. It is argued that recapitulationism (or the “biogenetic law”) appealed to psychological theorists for its moral and affective implications, rather than its scientific merit or usefulness in generating testable hypotheses. Central to this study is an emphasis on the use of recapitulationism to critique doctrines of evolutionary and social progress. The dissertation concludes that for contemporary neuroscientists no less than early psychoanalysts, the ghost of phylogeny, or the evolutionary past, is most often summoned to explain worrisome and unexpected disruptions in normal human development, especially when those disruptions emerge within what is taken to be the height of modernity.
Look-back Decoding for Open-Ended Text Generation
Given a prefix (context), open-ended generation aims to decode texts that are coherent (not abruptly drifting from previous topics) and informative (not suffering from undesired repetitions). In this paper, we propose Look-back, an improved decoding algorithm that leverages the Kullback-Leibler divergence to track the distance between the token distributions at the current and historical decoding steps. Look-back can thus automatically predict potential repetitive phrases and topic drift, remove tokens that may cause these failure modes, and restrict the next-token probability distribution to a plausible distance from the history. We perform decoding experiments on document continuation and story generation, and demonstrate that Look-back generates more fluent and coherent text, significantly outperforming other strong decoding methods in both automatic and human evaluations.
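The abstract sketches the mechanism only at a high level; a simplified Python rendering of the idea (the KL threshold, the top-k fallback, and all parameter values are assumptions for illustration, not the paper's exact algorithm) might look like:

```python
from typing import List
import torch

def lookback_step(curr_logits: torch.Tensor,
                  history_probs: List[torch.Tensor],
                  kl_threshold: float = 1.0,
                  top_k: int = 50) -> int:
    """If the current next-token distribution is too close (small KL
    divergence) to the distribution at some earlier step -- a signal of an
    impending repetition loop -- sample from the top-k candidates instead
    of decoding greedily. The caller appends each step's probability
    vector to `history_probs` after decoding."""
    curr_probs = torch.softmax(curr_logits, dim=-1)              # (V,)
    if history_probs:
        hist = torch.stack(history_probs)                        # (t, V)
        # KL(curr || hist_i) for every earlier decoding step i.
        kls = (curr_probs * (curr_probs.clamp_min(1e-9).log()
                             - hist.clamp_min(1e-9).log())).sum(dim=-1)
        if kls.min().item() < kl_threshold:
            top_probs, top_idx = torch.topk(curr_probs, top_k)
            choice = torch.multinomial(top_probs / top_probs.sum(), 1)
            return int(top_idx[choice].item())
    return int(curr_probs.argmax().item())
```

The key design choice is that the repetition check operates on whole distributions rather than surface n-grams, so it can fire before a loop literally repeats itself.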