Towards Better Multi-modal Keyphrase Generation via Visual Entity Enhancement and Multi-granularity Image Noise Filtering
Multi-modal keyphrase generation aims to produce a set of keyphrases that
represent the core points of the input text-image pair. In this regard,
dominant methods mainly focus on multi-modal fusion for keyphrase generation.
Nevertheless, two main drawbacks remain: 1) only a limited number of
sources, such as image captions, can be utilized to provide auxiliary
information, and these may not be sufficient for the subsequent keyphrase
generation; 2) the input text and image are often not perfectly matched, and
thus the image may introduce noise into the model. To address these
limitations, in this paper, we propose a novel multi-modal keyphrase generation
model, which not only enriches the model input with external knowledge, but
also effectively filters image noise. First, we introduce external visual
entities of the image as the supplementary input to the model, which benefits
the cross-modal semantic alignment for keyphrase generation. Second, we
simultaneously calculate an image-text matching score and image region-text
correlation scores to perform multi-granularity image noise filtering.
Particularly, we introduce the correlation scores between image regions and
ground-truth keyphrases to refine the calculation of the previously mentioned
correlation scores. To demonstrate the effectiveness of our model, we conduct
several groups of experiments on the benchmark dataset.
Experimental results and in-depth analyses show that our model achieves
state-of-the-art performance. Our code is available at
https://github.com/DeepLearnXMU/MM-MKP.
Comment: Accepted in Proceedings of the 31st ACM International Conference on
Multimedia (MM' 23
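The multi-granularity filtering idea in the abstract above (one image-level matching score plus per-region correlation scores) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names, the use of cosine similarity passed through a sigmoid, and mean pooling for the global image vector are hypothetical choices, not the authors' implementation, and the refinement with ground-truth keyphrases is not covered.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def filter_image_regions(text_vec, region_vecs):
    """Down-weight image regions using a coarse gate and a fine gate.

    Assumption: the global image vector is the mean of the region vectors,
    and both gates are sigmoid(cosine similarity). Hypothetical choices.
    """
    dim = len(text_vec)
    image_vec = [
        sum(region[i] for region in region_vecs) / len(region_vecs)
        for i in range(dim)
    ]
    # Coarse granularity: one image-text matching score for the whole image.
    match_score = sigmoid(cosine(text_vec, image_vec))
    # Fine granularity: one correlation score per image region.
    region_scores = [sigmoid(cosine(text_vec, region)) for region in region_vecs]
    # Each region is scaled by both scores before any later fusion step.
    gated = [
        [match_score * score * x for x in region]
        for score, region in zip(region_scores, region_vecs)
    ]
    return match_score, region_scores, gated
```

With this gating, a region whose vector points away from the text vector keeps less of its magnitude than a region aligned with the text, and a poorly matched image scales every region down at once.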
Abstractive Opinion Tagging
In e-commerce, opinion tags refer to a ranked list of tags provided by the
e-commerce platform that reflect characteristics of reviews of an item. To
help consumers quickly grasp a large number of reviews about an item,
opinion tags are increasingly being applied by e-commerce platforms. Current
mechanisms for generating opinion tags rely on either manual labelling or
heuristic methods, which are time-consuming and ineffective. In this paper, we
propose the abstractive opinion tagging task, where systems have to
automatically generate a ranked list of opinion tags that are based on, but
need not occur in, a given set of user-generated reviews.
The abstractive opinion tagging task comes with three main challenges: (1)
the noisy nature of reviews; (2) the formal nature of opinion tags vs. the
colloquial language usage in reviews; and (3) the need to distinguish between
different items with very similar aspects. To address these challenges, we
propose an abstractive opinion tagging framework, named AOT-Net, to generate a
ranked list of opinion tags given a large number of reviews. First, a
sentence-level salience estimation component estimates each review's salience
score. Next, a review clustering and ranking component ranks reviews in two
steps: first, reviews are grouped into clusters and ranked by cluster size;
then, reviews within each cluster are ranked by their distance to the cluster
center. Finally, given the ranked reviews, a rank-aware opinion tagging
component incorporates an alignment feature and alignment loss to generate a
ranked list of opinion tags. To facilitate the study of this task, we create
and release a large-scale dataset, called eComTag, crawled from real-world
e-commerce websites. Extensive experiments conducted on the eComTag dataset
verify the effectiveness of the proposed AOT-Net in terms of various evaluation
metrics.
Comment: Accepted by WSDM 202
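The two-step review ranking described in the abstract above (clusters ordered by size, then reviews within each cluster ordered by distance to the cluster center) can be sketched as follows. This is a minimal illustration under stated assumptions, not AOT-Net itself: the review vectors and cluster labels are taken as given inputs, the cluster center is assumed to be the mean vector, distance is assumed to be Euclidean, and the function name `rank_reviews` is hypothetical.

```python
import math
from collections import defaultdict


def rank_reviews(vectors, labels):
    """Return review indices ranked by cluster size, then closeness to the center.

    Assumptions: `labels[i]` is the cluster of review i (produced by any
    clustering method), the center is the mean vector, distance is Euclidean.
    """
    clusters = defaultdict(list)
    for idx, label in enumerate(labels):
        clusters[label].append(idx)

    def centroid(members):
        dim = len(vectors[members[0]])
        return [sum(vectors[i][d] for i in members) / len(members) for d in range(dim)]

    ranked = []
    # Step 1: larger clusters first (ties broken by label for determinism).
    ordered = sorted(clusters.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    for _, members in ordered:
        center = centroid(members)
        # Step 2: within a cluster, reviews nearest the center come first
        # (ties broken by original index).
        ranked.extend(sorted(members, key=lambda i: (math.dist(vectors[i], center), i)))
    return ranked


vectors = [[10, 10], [0, 0], [1, 0], [4, 0], [10, 12]]
labels = ["a", "b", "b", "b", "a"]
order = rank_reviews(vectors, labels)
```

In this toy input, cluster "b" has three reviews and is ranked ahead of cluster "a", and within "b" the review at `[1, 0]` is closest to the mean and comes first.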
Theme-driven Keyphrase Extraction to Analyze Social Media Discourse
Social media platforms are vital resources for sharing self-reported health
experiences, offering rich data on various health topics. Despite advancements
in Natural Language Processing (NLP) enabling large-scale social media data
analysis, a gap remains in applying keyphrase extraction to health-related
content. Keyphrase extraction is used to identify salient concepts in social
media discourse without being constrained by predefined entity classes. This
paper introduces a theme-driven keyphrase extraction framework tailored for
social media, a pioneering approach designed to capture clinically relevant
keyphrases from user-generated health texts. Themes are defined as broad
categories determined by the objectives of the extraction task. We formulate
this novel task of theme-driven keyphrase extraction and demonstrate its
potential for efficiently mining social media text for the use case of
treatment for opioid use disorder. This paper leverages qualitative and
quantitative analysis to demonstrate the feasibility of extracting actionable
insights from social media data and efficiently extracting keyphrases using
minimally supervised NLP models. Our contributions include the development of a
novel data collection and curation framework for theme-driven keyphrase
extraction and the creation of MOUD-Keyphrase, the first dataset of its kind
comprising human-annotated keyphrases from a Reddit community. We also identify
the scope of minimally supervised NLP models to extract keyphrases from social
media data efficiently. Lastly, we found that a large language model (ChatGPT)
outperforms unsupervised keyphrase extraction models, and we evaluate its
efficacy in this task.
Comment: 11 pages, 2 figures, submitted to ICWSM. This version represents a
substantial expansion and refocus of the previous manuscript, including new
experiments, expanded data analysis, and comprehensive discussion