UniMSE: Towards Unified Multimodal Sentiment Analysis and Emotion Recognition
Multimodal sentiment analysis (MSA) and emotion recognition in conversation
(ERC) are key research topics for computers to understand human behaviors. From
a psychological perspective, emotions are the expression of affect or feelings
during a short period, while sentiments are formed and held for a longer
period. However, most existing works study sentiment and emotion separately and
do not fully exploit the complementary knowledge behind the two. In this paper,
we propose a multimodal sentiment knowledge-sharing framework (UniMSE) that
unifies MSA and ERC tasks from features, labels, and models. We perform
modality fusion at the syntactic and semantic levels and introduce contrastive
learning between modalities and samples to better capture the difference and
consistency between sentiments and emotions. Experiments on four public
benchmark datasets, MOSI, MOSEI, MELD, and IEMOCAP, demonstrate the
effectiveness of the proposed method and achieve consistent improvements
compared with state-of-the-art methods.
Comment: Accepted to EMNLP 2022 main conference.
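The abstract mentions contrastive learning between modalities and samples but does not give the loss; a minimal sketch of an InfoNCE-style contrastive objective over paired modality embeddings, with all function names and the temperature value being illustrative assumptions rather than UniMSE's actual formulation:

```python
import math

def cosine(u, v):
    # Cosine similarity between two embedding vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def info_nce(text_embs, audio_embs, temperature=0.1):
    # Average InfoNCE loss treating (text_i, audio_i) as the positive pair
    # and every other audio embedding in the batch as a negative.
    losses = []
    for i, t in enumerate(text_embs):
        sims = [cosine(t, a) / temperature for a in audio_embs]
        m = max(sims)  # subtract max for numerical stability
        log_denom = m + math.log(sum(math.exp(s - m) for s in sims))
        losses.append(log_denom - sims[i])
    return sum(losses) / len(losses)
```

Aligned text/audio pairs should yield a lower loss than misaligned ones, which is what pulls matching modality representations together during training.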
UniSA: Unified Generative Framework for Sentiment Analysis
Sentiment analysis is a crucial task that aims to understand people's
emotional states and predict emotional categories based on multimodal
information. It consists of several subtasks, such as emotion recognition in
conversation (ERC), aspect-based sentiment analysis (ABSA), and multimodal
sentiment analysis (MSA). However, unifying all subtasks in sentiment analysis
presents numerous challenges, including modality alignment, unified
input/output forms, and dataset bias. To address these challenges, we propose a
Task-Specific Prompt method to jointly model subtasks and introduce a
multimodal generative framework called UniSA. Additionally, we organize the
benchmark datasets of main subtasks into a new Sentiment Analysis Evaluation
benchmark, SAEval. We design novel pre-training tasks and training methods to
enable the model to learn generic sentiment knowledge among subtasks to improve
the model's multimodal sentiment perception ability. Our experimental results
show that UniSA performs comparably to the state-of-the-art on all subtasks and
generalizes well to various subtasks in sentiment analysis.
Comment: Accepted to ACM MM 2023
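The abstract describes a Task-Specific Prompt method that lets one generative model serve ERC, ABSA, and MSA through a unified input/output form; the exact format is not given, so the marker scheme below is purely a hypothetical illustration of the idea:

```python
def format_unified_input(task, text, context=None):
    # Prefix a task marker so a single generative model can route among
    # sentiment subtasks (ERC / ABSA / MSA). The bracket tokens here are
    # assumptions for illustration, not UniSA's actual vocabulary.
    parts = [f"[{task.upper()}]", text]
    if context:
        parts.append(f"[CTX] {context}")
    return " ".join(parts)
```

With a shared format like this, heterogeneous datasets can be mixed in one training stream, which is the usual motivation for unifying subtasks under a generative framework.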
Self-Explanation Prompting Improves Dialogue Understanding in Large Language Models
Task-oriented dialogue (TOD) systems facilitate users in executing various
activities via multi-turn dialogues, but Large Language Models (LLMs) often
struggle to comprehend these intricate contexts. In this study, we propose a
novel "Self-Explanation" prompting strategy to enhance the comprehension
abilities of LLMs in multi-turn dialogues. This task-agnostic approach requires
the model to analyze each dialogue utterance before task execution, thereby
improving performance across various dialogue-centric tasks. Experimental
results from six benchmark datasets confirm that our method consistently
outperforms other zero-shot prompts and matches or exceeds the efficacy of
few-shot prompts, demonstrating its potential as a powerful tool in enhancing
LLMs' comprehension in complex dialogue tasks.
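Since Self-Explanation is described as a task-agnostic, zero-shot prompting strategy that asks the model to analyze each utterance before executing the task, its core can be sketched as plain prompt construction; the wording of the instruction below is an assumption, not the paper's exact prompt:

```python
def build_self_explanation_prompt(dialogue_turns, task_instruction):
    # Prepend an instruction asking the model to explain every utterance
    # before performing the downstream task (zero-shot, no exemplars).
    history = "\n".join(f"{speaker}: {utterance}"
                        for speaker, utterance in dialogue_turns)
    return (
        "First, explain the intent of each utterance in the dialogue below, "
        "one per line. Then perform the task.\n\n"
        f"Dialogue:\n{history}\n\n"
        f"Task: {task_instruction}"
    )
```

The same wrapper can sit in front of any dialogue-centric task (state tracking, response selection, summarization), which is what makes the approach task-agnostic.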
Improving Factual Consistency of Text Summarization by Adversarially Decoupling Comprehension and Embellishment Abilities of LLMs
Despite the recent progress in text summarization made by large language
models (LLMs), they often generate summaries that are factually inconsistent
with original articles, known as "hallucinations" in text generation. Unlike
previous small models (e.g., BART, T5), current LLMs make fewer silly mistakes
but more sophisticated ones, such as imposing cause and effect, adding false
details, overgeneralizing, etc. These hallucinations are difficult to detect
with traditional methods, which makes improving the factual consistency of
text summarization especially challenging. In this paper, we propose an
adversarially DEcoupling method to disentangle the Comprehension and
EmbellishmeNT abilities of LLMs (DECENT). Furthermore, we adopt probing-based
efficient training to compensate for LLMs' insufficient sensitivity to the
distinction between true and false. In this way, LLMs are less confused about
embellishing and understanding; thus, they can execute the instructions more
accurately and have enhanced abilities to distinguish hallucinations.
Experimental results show that DECENT significantly improves the reliability of
text summarization based on LLMs.
SpokenWOZ: A Large-Scale Speech-Text Benchmark for Spoken Task-Oriented Dialogue Agents
Task-oriented dialogue (TOD) models have made significant progress in recent
years. However, previous studies primarily focus on datasets written by
annotators, which has resulted in a gap between academic research and
real-world spoken conversation scenarios. While several small-scale spoken TOD
datasets have been proposed to address robustness issues such as ASR errors, they
ignore the unique challenges in spoken conversation. To tackle the limitations,
we introduce SpokenWOZ, a large-scale speech-text dataset for spoken TOD,
containing 8 domains, 203k turns, 5.7k dialogues and 249 hours of audio from
human-to-human spoken conversations. SpokenWOZ further incorporates common
spoken characteristics such as word-by-word processing and reasoning in spoken
language. Based on these characteristics, we present cross-turn slot and
reasoning slot detection as new challenges. We conduct experiments on various
baselines, including text-modal models, newly proposed dual-modal models, and
LLMs, e.g., ChatGPT. The results show that the current models still have
substantial room for improvement in spoken conversation, where the most
advanced dialogue state tracker only achieves 25.65% in joint goal accuracy and
the SOTA end-to-end model only correctly completes the user request in 52.1% of
dialogues. The dataset, code, and leaderboard are available:
https://spokenwoz.github.io/SpokenWOZ-github.io/
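The 25.65% figure above is joint goal accuracy, the standard dialogue state tracking metric: a turn counts as correct only if the entire predicted slot-value state matches the gold state exactly. A minimal sketch of the metric (slot names below are made up for illustration):

```python
def joint_goal_accuracy(predictions, golds):
    # Fraction of turns whose predicted dialogue state (a dict mapping
    # slot -> value) matches the gold state exactly. One wrong or missing
    # slot makes the whole turn count as incorrect.
    correct = sum(1 for pred, gold in zip(predictions, golds) if pred == gold)
    return correct / len(golds)
```

Because every slot must be right simultaneously, joint goal accuracy is much stricter than per-slot accuracy, which is why scores like 25.65% on spoken data can coexist with far higher slot-level numbers.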
E-Beam Patterned Gold Nanodot Arrays on Optical Fiber Tips for Localized Surface Plasmon Resonance Biochemical Sensing
Electron beam lithography (EBL) was used to directly pattern periodic gold nanodot arrays on optical fiber tips. Localized surface plasmon resonance of the E-beam patterned gold nanodot arrays was utilized for biochemical sensing. The advantage of optical-fiber-based localized surface plasmon resonance (LSPR) sensors is that they are convenient to work with and can operate in harsh environments. An optical fiber tip LSPR refractive index sensor with a sensitivity of 196 nm per refractive index unit (RIU) has been demonstrated. The affinity sensing capability of the fiber tip sensor was demonstrated using biotin/streptavidin as the receptor/analyte pair. The detection limit for streptavidin was determined to be 6 pM.
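Sensitivity in nm/RIU is conventionally obtained as the slope of resonance wavelength shift versus refractive index; assuming that convention (the abstract does not state the fitting procedure, and the data points below are invented for illustration), the calculation is a simple least-squares slope:

```python
def ri_sensitivity(wavelength_shifts_nm, refractive_indices):
    # Sensitivity S = d(lambda)/d(n): least-squares slope of resonance
    # wavelength shift (nm) against bulk refractive index (RIU).
    n = len(refractive_indices)
    mean_x = sum(refractive_indices) / n
    mean_y = sum(wavelength_shifts_nm) / n
    num = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(refractive_indices, wavelength_shifts_nm))
    den = sum((x - mean_x) ** 2 for x in refractive_indices)
    return num / den  # nm per RIU
```

For instance, shifts of 0, 3.92 and 7.84 nm at indices 1.33, 1.35 and 1.37 would correspond to the reported 196 nm/RIU sensitivity.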