DivGraphPointer: A Graph Pointer Network for Extracting Diverse Keyphrases
Keyphrase extraction from documents is useful to a variety of applications
such as information retrieval and document summarization. This paper presents
an end-to-end method called DivGraphPointer for extracting a set of diversified
keyphrases from a document. DivGraphPointer combines the advantages of
traditional graph-based ranking methods and recent neural network-based
approaches. Specifically, given a document, a word graph is constructed from
the document based on word proximity and is encoded with graph convolutional
networks, which effectively capture document-level word salience by modeling
long-range dependency between words in the document and aggregating multiple
appearances of identical words into one node. Furthermore, we propose a
diversified point network to generate a set of diverse keyphrases out of the
word graph in the decoding process. Experimental results on five benchmark data
sets show that our proposed method significantly outperforms the existing
state-of-the-art approaches.
Comment: Accepted to SIGIR 201
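
The graph construction step described in the abstract above (one node per distinct word, edges based on word proximity) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `build_word_graph`, the `max_distance` parameter, and the undirected count-based edge weights are assumptions made for illustration, and the graph convolutional encoder and pointer decoder are not shown.

```python
from collections import defaultdict


def build_word_graph(tokens, max_distance=2):
    """Build an undirected, weighted word graph from a token list.

    Every distinct word becomes a single node, so repeated appearances
    of the same word are merged. The weight of an edge counts how often
    two different words occur within `max_distance` positions of each
    other. (Hypothetical simplification of proximity-based weighting.)
    """
    graph = defaultdict(lambda: defaultdict(int))
    for i, word in enumerate(tokens):
        graph[word]  # register the node even if it gets no edges
        last = min(i + 1 + max_distance, len(tokens))
        for j in range(i + 1, last):
            other = tokens[j]
            if other != word:
                graph[word][other] += 1
                graph[other][word] += 1
    return graph


tokens = "graph pointer network extracts keyphrases from a word graph".split()
graph = build_word_graph(tokens, max_distance=2)
```

In this example the word "graph" appears twice in the input but yields one node, whose neighbors come from both occurrences. That merging is what lets a graph encoder aggregate evidence about a word from across the whole document.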
ChatGPT vs State-of-the-Art Models: A Benchmarking Study in Keyphrase Generation Task
Transformer-based language models, including ChatGPT, have demonstrated
exceptional performance in various natural language generation tasks. However,
there has been limited research evaluating ChatGPT's keyphrase generation
ability, which involves identifying informative phrases that accurately reflect
a document's content. This study seeks to address this gap by comparing
ChatGPT's keyphrase generation performance with state-of-the-art models, while
also testing its potential as a solution for two significant challenges in the
field: domain adaptation and keyphrase generation from long documents. We
conducted experiments on six publicly available datasets from scientific
articles and news domains, analyzing performance on both short and long
documents. Our results show that ChatGPT outperforms current state-of-the-art
models in all tested datasets and environments, generating high-quality
keyphrases that adapt well to diverse domains and document lengths.
- …