Factually Consistent Summarization via Reinforcement Learning with Textual Entailment Feedback
Despite the apparent success of contemporary grounded text generation systems,
they often generate text that is factually inconsistent with their input. This
problem is especially acute in tasks like summarization, in which the
generated summaries should be corroborated by their source article. In this
work, we leverage recent progress on textual entailment models to directly
address this problem for abstractive summarization systems. We use
reinforcement learning with reference-free, textual entailment rewards to
optimize for factual consistency and explore the ensuing trade-offs, as
improved consistency may come at the cost of less informative or more
extractive summaries. Our results, according to both automatic metrics and
human evaluation, show that our method considerably improves the faithfulness,
salience, and conciseness of the generated summaries.
Comment: ACL 202
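A minimal sketch of the idea described above: a reference-free faithfulness reward is computed from the source and a sampled summary, then plugged into a REINFORCE-style update. The real method scores summaries with a trained textual entailment (NLI) model; here `entailment_reward` is a toy lexical-overlap proxy standing in for that model, and all names are illustrative, not the paper's implementation.

```python
def entailment_reward(source, summary):
    """Toy proxy for an entailment model: fraction of summary tokens
    that also appear in the source (assumption, not the paper's scorer)."""
    src = set(source.lower().split())
    toks = summary.lower().split()
    if not toks:
        return 0.0
    return sum(t in src for t in toks) / len(toks)

def reinforce_loss(token_log_probs, reward, baseline=0.0):
    """REINFORCE surrogate loss: -(R - b) * sum of token log-probabilities.
    Minimizing it raises the likelihood of high-reward summaries."""
    return -(reward - baseline) * sum(token_log_probs)

source = "the cat sat on the mat"
faithful = entailment_reward(source, "the cat sat")   # fully supported: 1.0
hallucinated = entailment_reward(source, "the dog ran")  # largely unsupported
```

Because the reward is reference-free, it can be computed for any sampled summary, which is what makes it usable as an RL training signal rather than only an evaluation metric.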
Attractive or Faithful? Popularity-Reinforced Learning for Inspired Headline Generation
With the rapid proliferation of online media sources and published news,
headlines have become increasingly important for attracting readers to news
articles, since users may be overwhelmed by the sheer volume of information. In this
paper, we generate inspired headlines that preserve the nature of news articles
and catch the eye of the reader simultaneously. The task of inspired headline
generation can be viewed as a specific form of Headline Generation (HG) task,
with the emphasis on creating an attractive headline from a given news article.
To generate inspired headlines, we propose a novel framework called
POpularity-Reinforced Learning for inspired Headline Generation (PORL-HG).
PORL-HG exploits the extractive-abstractive architecture with 1) Popular Topic
Attention (PTA) for guiding the extractor to select the attractive sentence
from the article and 2) a popularity predictor for guiding the abstractor to
rewrite the attractive sentence. Moreover, since the sentence selection of the
extractor is not differentiable, techniques of reinforcement learning (RL) are
utilized to bridge the gap with rewards obtained from a popularity score
predictor. Through quantitative and qualitative experiments, we show that the
proposed PORL-HG significantly outperforms the state-of-the-art headline
generation models in terms of attractiveness, as evaluated by both humans
(71.03%) and the predictor (at least 27.60%), while the faithfulness of
PORL-HG remains comparable to the state-of-the-art generation model.
Comment: AAAI 202
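The RL bridge described above can be sketched compactly: sentence selection is a discrete, non-differentiable choice, so the extractor is trained with REINFORCE, using a popularity score as the reward. In this toy sketch, `popularity_score` is a hypothetical stand-in for the paper's learned popularity predictor, and the extractor's per-sentence scores are given directly rather than produced by a network.

```python
import math
import random

def popularity_score(sentence):
    """Hypothetical stand-in for the learned popularity predictor:
    shorter, punchier sentences score higher (an assumption)."""
    return 1.0 / (1.0 + len(sentence.split()))

def softmax(scores):
    m = max(scores)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def reinforce_step(extractor_scores, sentences, rng):
    """Sample one sentence (the non-differentiable step) and return the
    (index, reward, log-prob) triple a REINFORCE update would use."""
    probs = softmax(extractor_scores)
    idx = rng.choices(range(len(sentences)), weights=probs)[0]
    return idx, popularity_score(sentences[idx]), math.log(probs[idx])

sentences = ["A long and rather dry opening sentence.", "Shock result!"]
idx, reward, logp = reinforce_step([0.2, 1.0], sentences, random.Random(0))
```

The gradient signal is then `-(reward - baseline) * logp`, which lets the popularity reward flow back to the extractor even though the argmax-style selection itself has no gradient.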
Reinforcement Learning for Generative AI: A Survey
Deep generative AI has long been an essential topic in the machine learning
community, with impact on application areas such as text generation and
computer vision. The major paradigm for training a generative model
is maximum likelihood estimation, which pushes the learner to capture and
approximate the target data distribution by decreasing the divergence between
the model distribution and the target distribution. This formulation
successfully establishes the objective of generative tasks, but it cannot
satisfy all the requirements a user might expect from a generative model.
Reinforcement learning, by contrast, injects new training signals through new
objectives, and has demonstrated its power and flexibility in incorporating
human inductive bias from multiple angles, such as adversarial learning,
hand-designed rules, and learned reward models. As a result, reinforcement
learning has become an active research field and has stretched the limits of
generative AI in both model design and application. A comprehensive review of
recent advances is therefore timely. While recent surveys cover individual
application areas, this survey offers a high-level review that spans a range
of them. We provide a rigorous taxonomy and broad coverage of models and
applications, including the fast-developing large language model area. We
conclude by outlining directions that might address the limitations of
current models and expand the frontiers of generative AI.
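The contrast the survey draws between the two training paradigms can be made concrete with a toy token-level example: maximum likelihood pushes up the probability of observed data, while an RL objective maximizes the expected value of an arbitrary user-defined reward. All names and numbers here are illustrative.

```python
import math

def mle_loss(model_probs, target_token):
    """Maximum likelihood: negative log-likelihood of the observed token."""
    return -math.log(model_probs[target_token])

def rl_objective(model_probs, reward_fn):
    """RL view: expected reward under the model, sum_x p(x) * R(x).
    The reward need not come from the data distribution at all."""
    return sum(p * reward_fn(tok) for tok, p in model_probs.items())

probs = {"cat": 0.7, "dog": 0.2, "moo": 0.1}
nll = mle_loss(probs, "cat")  # minimized by raising p("cat")
# A reward encoding a user preference the data never expresses: avoid "moo".
expected_r = rl_objective(probs, lambda tok: float(tok != "moo"))
```

The second objective is what lets RL inject signals, such as adversarial discriminators, hand-designed rules, or learned reward models, that plain divergence minimization cannot express.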