Rethinking Model Selection and Decoding for Keyphrase Generation with Pre-trained Sequence-to-Sequence Models
Keyphrase Generation (KPG) is a longstanding task in NLP with widespread
applications. The advent of sequence-to-sequence (seq2seq) pre-trained language
models (PLMs) has ushered in a transformative era for KPG, yielding promising
performance improvements. However, many design decisions remain unexplored and
are often made arbitrarily. This paper undertakes a systematic analysis of the
influence of model selection and decoding strategies on PLM-based KPG. We begin
by elucidating why seq2seq PLMs are apt for KPG, anchored by an
attention-driven hypothesis. We then establish that conventional wisdom for
selecting seq2seq PLMs lacks depth: (1) merely increasing model size or
performing task-specific adaptation is not parameter-efficient; (2) although
combining in-domain pre-training with task adaptation benefits KPG, it does
partially hinder generalization. Regarding decoding, we demonstrate that while
greedy search achieves strong F1 scores, it lags in recall compared with
sampling-based methods. Based on these insights, we propose DeSel, a
likelihood-based decode-select algorithm for seq2seq PLMs. DeSel improves
greedy search by an average of 4.7% semantic F1 across five datasets. Our
collective findings pave the way for deeper future investigations into
PLM-based KPG.
Comment: EMNLP 2023 camera ready
BanglaNLG and BanglaT5: Benchmarks and Resources for Evaluating Low-Resource Natural Language Generation in Bangla
This work presents BanglaNLG, a comprehensive benchmark for evaluating
natural language generation (NLG) models in Bangla, a widely spoken yet
low-resource language. We aggregate six challenging conditional text generation
tasks under the BanglaNLG benchmark, introducing a new dataset on dialogue
generation in the process. Then, using a clean corpus of 27.5 GB of Bangla
data, we pretrain BanglaT5, a sequence-to-sequence Transformer model for
Bangla. BanglaT5 achieves state-of-the-art performance in all of these tasks,
outperforming several multilingual models by up to 9% absolute gain and 32%
relative gain. We are making the new dataset, the BanglaT5 language model, and
a leaderboard publicly available at https://github.com/csebuetnlp/BanglaNLG in
the hope of advancing future research and evaluation on Bangla NLG.
Comment: Accepted at the Findings of EACL 202