TransNFCM: Translation-Based Neural Fashion Compatibility Modeling
Identifying mix-and-match relationships between fashion items is an urgent
task in a fashion e-commerce recommender system. It will significantly enhance
user experience and satisfaction. However, due to the challenges of inferring
the rich yet complicated set of compatibility patterns in a large e-commerce
corpus of fashion items, this task is still underexplored. Inspired by the
recent advances in multi-relational knowledge representation learning and deep
neural networks, this paper proposes a novel Translation-based Neural Fashion
Compatibility Modeling (TransNFCM) framework, which jointly optimizes fashion
item embeddings and category-specific complementary relations in a unified
space in an end-to-end manner. TransNFCM places items in a unified
embedding space where a category-specific relation (category-comp-category) is
modeled as a vector translation operating on the embeddings of compatible items
from the corresponding categories. In this way, we not only capture the notion
of compatibility conditioned on a specific pair of complementary categories but
also preserve the global notion of compatibility.
We also design a deep fashion item encoder which exploits the complementary
characteristics of visual and textual features to represent fashion
products. To the best of our knowledge, this is the first work that uses
category-specific complementary relations to model the category-aware
compatibility between items in a translation-based embedding space. Extensive
experiments demonstrate the effectiveness of TransNFCM over state-of-the-art
methods on two real-world datasets. Comment: Accepted at the AAAI 2019
conference.
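To make the translation idea concrete, the following is a minimal PyTorch
sketch of a TransNFCM-style scoring function, where compatibility between a
head item h and a tail item t under a category-pair relation r is scored as
-||h + r - t||. The dimensions, relation lookup, and margin loss are
illustrative assumptions, not the authors' exact architecture.

```python
import torch
import torch.nn.functional as F

class TranslationCompatibility(torch.nn.Module):
    """Sketch of translation-based compatibility scoring: a learned
    vector for each (category, category) pair translates the head
    item's embedding toward embeddings of compatible tail items."""

    def __init__(self, num_category_pairs: int, dim: int = 128):
        super().__init__()
        self.relations = torch.nn.Embedding(num_category_pairs, dim)

    def forward(self, h, t, pair_idx):
        # h, t: (B, dim) item embeddings; pair_idx: (B,) relation ids.
        r = self.relations(pair_idx)
        # Higher score = smaller translation residual = more compatible.
        return -torch.norm(h + r - t, p=2, dim=-1)

def margin_ranking_loss(score_pos, score_neg, margin: float = 1.0):
    # Standard objective for translation-based embeddings: push the
    # compatible pair's score above the incompatible one's by a margin.
    return F.relu(margin - score_pos + score_neg).mean()
```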
Building Emotional Support Chatbots in the Era of LLMs
The integration of emotional support into conversational scenarios such as
social interaction, mental health counseling, and customer service presents
profound societal benefits. However, there are unsolved challenges that
hinder real-world applications in this field, including limited data
availability and the absence of well-accepted model training paradigms. This
work endeavors to navigate these challenges by harnessing the capabilities of
Large Language Models (LLMs). We introduce an innovative methodology that
synthesizes human insights with the computational prowess of LLMs to curate an
extensive emotional support dialogue dataset. Our approach is initiated with a
meticulously designed set of dialogues spanning diverse scenarios as generative
seeds. By utilizing the in-context learning potential of ChatGPT, we
recursively generate an ExTensible Emotional Support dialogue dataset, named
ExTES. Following this, we deploy advanced tuning techniques on the LLaMA model,
examining the impact of diverse training strategies, ultimately yielding an LLM
meticulously optimized for emotional support interactions. An exhaustive
assessment of the resultant model showcases its proficiency in offering
emotional support, marking a pivotal step in the realm of emotional support
bots and paving the way for subsequent research and implementations.
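As a rough illustration of the seed-and-expand loop described above, the
sketch below uses the OpenAI Python client to grow a dialogue corpus from a
few seed dialogues via in-context generation. The model name, prompt wording,
and expansion counts are assumptions, not the paper's exact recipe.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def expand_dialogues(seed_dialogues, rounds=2, per_seed=3):
    """Recursively grow a corpus: each round, every dialogue serves as
    an in-context exemplar for generating new support dialogues."""
    corpus, frontier = list(seed_dialogues), list(seed_dialogues)
    for _ in range(rounds):
        new_frontier = []
        for exemplar in frontier:
            for _ in range(per_seed):
                resp = client.chat.completions.create(
                    model="gpt-3.5-turbo",  # stand-in for "ChatGPT"
                    messages=[
                        {"role": "system",
                         "content": "You write multi-turn emotional-support "
                                    "dialogues between a help-seeker and a "
                                    "supporter."},
                        {"role": "user",
                         "content": "Example dialogue:\n" + exemplar
                                    + "\n\nWrite a new dialogue set in a "
                                      "different scenario, same support style."},
                    ],
                )
                new_frontier.append(resp.choices[0].message.content)
        corpus.extend(new_frontier)
        frontier = new_frontier
    return corpus
```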
Conversation Disentanglement with Bi-Level Contrastive Learning
Conversation disentanglement aims to group utterances into detached sessions,
which is a fundamental task in processing multi-party conversations. Existing
methods have two main drawbacks. First, they overemphasize pairwise utterance
relations but pay inadequate attention to the utterance-to-context relation
modeling. Second, a huge amount of human-annotated data is required for training,
which is expensive to obtain in practice. To address these issues, we propose a
general disentanglement model based on bi-level contrastive learning. It brings
utterances in the same session closer together while encouraging each utterance
to be near its clustered session prototype in the representation space. Unlike
existing approaches, our disentanglement model works in both the supervised
setting with labeled data and the unsupervised setting when no such data is
available. The proposed method achieves new state-of-the-art performance in
both settings across several public datasets.
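A minimal PyTorch sketch of what a bi-level contrastive objective of this kind
could look like: one InfoNCE-style term over same-session utterance pairs,
plus a term pulling each utterance toward its session prototype (taken here as
the mean embedding of its session). The temperature and prototype definition
are assumptions.

```python
import torch
import torch.nn.functional as F

def bi_level_contrastive_loss(utt, session_ids, temperature=0.1):
    """utt: (N, d) utterance embeddings; session_ids: (N,) session labels."""
    z = F.normalize(utt, dim=-1)
    sim = z @ z.t() / temperature
    eye = torch.eye(len(z), dtype=torch.bool, device=z.device)
    pos = (session_ids.unsqueeze(0) == session_ids.unsqueeze(1)) & ~eye

    # Level 1: pull same-session utterance pairs together (InfoNCE).
    logits = sim.masked_fill(eye, float("-inf"))
    log_prob = logits - torch.logsumexp(logits, dim=1, keepdim=True)
    utt_loss = -(log_prob.masked_fill(~pos, 0.0).sum(1)
                 / pos.sum(1).clamp(min=1))

    # Level 2: pull each utterance toward its session prototype.
    sessions = session_ids.unique()  # sorted unique session labels
    protos = F.normalize(torch.stack(
        [z[session_ids == s].mean(0) for s in sessions]), dim=-1)
    target = torch.searchsorted(sessions, session_ids)
    proto_loss = F.cross_entropy(z @ protos.t() / temperature, target)

    return utt_loss.mean() + proto_loss
```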
Prompting and Evaluating Large Language Models for Proactive Dialogues: Clarification, Target-guided, and Non-collaboration
Conversational systems based on Large Language Models (LLMs), such as
ChatGPT, show exceptional proficiency in context understanding and response
generation. However, despite their impressive capabilities, they still possess
limitations, such as providing randomly guessed answers to ambiguous queries or
failing to refuse users' requests, both of which are considered aspects of a
conversational agent's proactivity. This raises the question of whether
LLM-based conversational systems are equipped to handle proactive dialogue
problems. In this work, we conduct a comprehensive analysis of LLM-based
conversational systems, specifically focusing on three aspects of proactive
dialogue systems: clarification, target-guided, and non-collaborative
dialogues. To trigger the proactivity of LLMs, we propose the Proactive
Chain-of-Thought prompting scheme, which augments LLMs with the goal planning
capability over descriptive reasoning chains. Empirical findings are discussed
to promote future studies on LLM-based proactive dialogue systems. Comment:
Work in progress.
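To illustrate, a Proactive Chain-of-Thought-style prompt for the clarification
setting might look like the template below: the model is asked to first reason
about ambiguity and plan a goal, then act. The wording is a hypothetical
paraphrase, not the paper's verbatim template.

```python
# Hypothetical ProCoT-style template for clarification dialogues.
PROACTIVE_COT = """\
Conversation so far:
{history}

First, reason step by step:
1. Decide whether the user's latest request is ambiguous.
2. If ambiguous, plan a goal: choose the clarifying question that
   best resolves the ambiguity.
3. If clear, plan the answer directly.
Then act: output either the clarifying question or the answer.

Reasoning:"""

def build_prompt(history: str) -> str:
    return PROACTIVE_COT.format(history=history)
```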
Mix-Initiative Response Generation with Dynamic Prefix Tuning
Mixed initiative serves as one of the key factors in controlling conversation
directions. For a speaker, responding passively or leading proactively would
result in rather different responses. However, most dialogue systems focus on
training a holistic response generation model without any distinction among
different initiatives. It leads to the cross-contamination problem, where the
model confuses different initiatives and generates inappropriate responses.
Moreover, obtaining plenty of human annotations for initiative labels can be
expensive. To address these issues, we propose a general mix-Initiative Dynamic
Prefix Tuning framework (IDPT) to decouple different initiatives from the
generation model, which learns initiative-aware prefixes in both supervised and
unsupervised settings. Specifically, IDPT decouples initiative factors into
different prefix parameters and uses the attention mechanism to adjust the
selection of initiatives in guiding generation dynamically. The prefix
parameters can be tuned towards accurate initiative prediction as well as
mix-initiative response generation. Extensive experiments on two public
dialogue datasets show that the proposed IDPT outperforms previous baselines on
both automatic metrics and human evaluations. It also manages to generate
appropriate responses with manipulated initiatives. Comment: Accepted to the
main conference of NAACL 2024.
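The sketch below shows one plausible reading of the dynamic prefix mechanism:
each initiative owns a learned prefix, and attention over a pooled
dialogue-context vector mixes the prefixes before they are prepended to the
generator, with the attention weights doubling as an initiative prediction.
Sizes and the two-initiative setup are assumptions.

```python
import torch

class DynamicPrefixMixer(torch.nn.Module):
    """Illustrative IDPT-style module: initiative-specific prefixes
    selected dynamically by attention over the dialogue context."""

    def __init__(self, num_initiatives=2, prefix_len=10, dim=768):
        super().__init__()
        # One learned prefix per initiative: (I, L, d).
        self.prefixes = torch.nn.Parameter(
            torch.randn(num_initiatives, prefix_len, dim) * 0.02)
        self.query = torch.nn.Linear(dim, dim)

    def forward(self, context_vec):
        # context_vec: (B, d) pooled encoding of the dialogue history.
        q = self.query(context_vec)                 # (B, d)
        keys = self.prefixes.mean(dim=1)            # (I, d)
        attn = torch.softmax(q @ keys.t(), dim=-1)  # (B, I) initiative weights
        # Mix prefixes by attention; prepend result to generator inputs.
        mixed = torch.einsum("bi,ild->bld", attn, self.prefixes)
        return mixed, attn
```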
Multi-roles affiliation model for general user profiling
National Research Foundation (NRF) Singapore under its International Research Centres in Singapore Funding Initiative
SGSH: Stimulate Large Language Models with Skeleton Heuristics for Knowledge Base Question Generation
Knowledge base question generation (KBQG) aims to generate natural language
questions from a set of triplet facts extracted from a KB. Existing methods have
significantly boosted the performance of KBQG via pre-trained language models
(PLMs), thanks to their rich semantic knowledge. With the advance of
pre-training techniques, large language models (LLMs) (e.g., GPT-3.5)
undoubtedly possess much more semantic knowledge. Therefore, how to effectively
organize and exploit the abundant knowledge for KBQG becomes the focus of our
study. In this work, we propose SGSH, a simple and effective framework to
Stimulate GPT-3.5 with Skeleton Heuristics to enhance KBQG. The framework
incorporates "skeleton heuristics", which provides more fine-grained guidance
associated with each input to stimulate LLMs to generate optimal questions,
encompassing essential elements like the question phrase and the auxiliary
verb. More specifically, we devise an automatic data construction strategy
leveraging ChatGPT to construct a skeleton training dataset, based on which we
employ a soft prompting approach to train a BART model dedicated to generating
the skeleton associated with each input. Subsequently, skeleton heuristics are
encoded into the prompt to incentivize GPT-3.5 to generate desired questions.
Extensive experiments demonstrate that SGSH achieves new state-of-the-art
performance on KBQG tasks. Comment: Accepted by NAACL 2024 Findings.
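As an illustration of the final prompting step, the sketch below encodes a
skeleton (assumed here to come from the fine-tuned BART model) into a GPT-3.5
prompt via the OpenAI client; the prompt wording and the example triple are
hypothetical.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def generate_question(triples: str, skeleton: str) -> str:
    """Steer GPT-3.5 with a skeleton heuristic (e.g., question phrase
    plus auxiliary verb) when verbalizing KB triples as a question."""
    resp = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{
            "role": "user",
            "content": (
                "Generate a natural language question from these "
                f"knowledge-base triples:\n{triples}\n"
                f"Begin the question with this skeleton: {skeleton}\n"
                "Question:"),
        }],
    )
    return resp.choices[0].message.content

# Hypothetical usage:
# generate_question("(Barack_Obama, place_of_birth, Honolulu)", "where was")
```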