Learning an Effective Context-Response Matching Model with Self-Supervised Tasks for Retrieval-based Dialogues
Building an intelligent dialogue system with the ability to select a proper
response according to a multi-turn context is a highly challenging task.
Existing studies focus on building a context-response matching model with
various neural architectures or PLMs and typically learn with a single
response prediction task. These approaches overlook many potential training
signals contained in dialogue data that might benefit context
understanding and yield better features for response prediction. Moreover,
responses retrieved by dialogue systems trained in this conventional way
still face critical problems, including incoherence and inconsistency. To
address these issues, in this paper, we propose learning
a context-response matching model with auxiliary self-supervised tasks designed
for the dialogue data based on pre-trained language models. Specifically, we
introduce four self-supervised tasks including next session prediction,
utterance restoration, incoherence detection and consistency discrimination,
and jointly train the PLM-based response selection model with these auxiliary
tasks in a multi-task manner. In this way, the auxiliary tasks guide the
learning of the matching model toward a better local optimum and more
appropriate response selection. Experimental results on two benchmarks indicate that the
proposed auxiliary self-supervised tasks bring significant improvement for
multi-turn response selection in retrieval-based dialogues, and our model
achieves new state-of-the-art results on both datasets.
Comment: 10 pages
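Read purely as a sketch, the joint training described in this abstract amounts to a main response-selection loss plus weighted auxiliary losses. The plain cross-entropy form and the per-task weights below are assumptions for illustration, not details taken from the paper:

```python
import numpy as np

def cross_entropy(logits, label):
    # Numerically stable softmax cross-entropy for a single example.
    z = logits - np.max(logits)
    log_probs = z - np.log(np.sum(np.exp(z)))
    return -log_probs[label]

def joint_loss(main_logits, main_label, aux_outputs, weights):
    """Combine the response-selection loss with weighted auxiliary losses.

    aux_outputs: list of (logits, label) pairs, one per auxiliary task
                 (e.g. next session prediction, incoherence detection)
    weights: per-task weights -- a hypothetical choice; the paper only
             states that the tasks are trained jointly
    """
    loss = cross_entropy(main_logits, main_label)
    for (logits, label), w in zip(aux_outputs, weights):
        loss += w * cross_entropy(logits, label)
    return loss

# Example: response prediction plus two auxiliary tasks
main = (np.array([2.0, 0.5]), 0)
aux = [(np.array([1.0, -1.0]), 0),   # e.g. incoherence detection
       (np.array([0.2, 0.8]), 1)]    # e.g. consistency discrimination
total = joint_loss(*main, aux, [0.5, 0.5])
```

In a real system each logits vector would come from a PLM head; here they are toy numbers so the arithmetic is inspectable.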
Cross-Lingual Low-Resource Set-to-Description Retrieval for Global E-Commerce
With the prosperity of cross-border e-commerce, there is an urgent demand for
intelligent approaches that assist e-commerce sellers in offering
local products to consumers from all over the world. In this paper, we explore
a new task of cross-lingual information retrieval, i.e., cross-lingual
set-to-description retrieval in cross-border e-commerce, which involves
matching product attribute sets in the source language with persuasive product
descriptions in the target language. We manually collect a new and high-quality
paired dataset, where each pair contains an unordered product attribute set in
the source language and an informative product description in the target
language. As the dataset construction process is both time-consuming and
costly, the new dataset comprises only 13.5k pairs, making it a low-resource
setting and a challenging testbed for model development and
evaluation in cross-border e-commerce. To tackle this cross-lingual
set-to-description retrieval task, we propose a novel cross-lingual matching
network (CLMN) with the enhancement of context-dependent cross-lingual mapping
upon the pre-trained monolingual BERT representations. Experimental results
indicate that our proposed CLMN yields impressive results on the challenging
task and the context-dependent cross-lingual mapping on BERT yields noticeable
improvement over the pre-trained multi-lingual BERT model.
Comment: AAAI 202
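The matching step can be sketched as mapping source-language attribute embeddings into the target-language space and scoring candidate descriptions. The plain linear map `W` and mean pooling below are simplifying assumptions; the paper's context-dependent mapping over BERT representations is richer:

```python
import numpy as np

def cross_lingual_match(attr_vecs, desc_vecs, W):
    """Match a source-language attribute set to target-language descriptions.

    attr_vecs: (n_attr, d) embeddings of the unordered attribute set
    desc_vecs: (n_desc, d) embeddings of candidate descriptions
    W:         (d, d) learned cross-lingual mapping (a hypothetical
               linear stand-in for the paper's context-dependent mapping)
    Returns the index of the best-scoring description.
    """
    # Pool the unordered attribute set into a single query vector.
    query = (attr_vecs @ W).mean(axis=0)
    # Cosine similarity against every candidate description.
    scores = desc_vecs @ query / (
        np.linalg.norm(desc_vecs, axis=1) * np.linalg.norm(query) + 1e-9)
    return int(np.argmax(scores))
```

In practice the embeddings would come from monolingual BERT encoders and `W` would be trained on the 13.5k paired examples.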
Tri-Attention: Explicit Context-Aware Attention Mechanism for Natural Language Processing
In natural language processing (NLP), the context of a word or sentence plays
an essential role. Contextual information, such as the semantic representation
of a passage or of historical dialogue, is integral to a conversation
and to precisely understanding the present phrase or sentence. However, the
standard attention mechanisms, despite their great success in modeling
sequence alignment, typically generate weights from the query and key alone
and ignore context, forming a Bi-Attention framework. This Bi-Attention
mechanism does not explicitly model the interactions among the contexts,
queries, and keys of target sequences, missing important contextual
information and yielding poor attention performance. Accordingly, we propose
a novel and general triple-attention (Tri-Attention) framework that expands
the standard Bi-Attention mechanism and explicitly relates query, key, and
context by incorporating context as the third dimension in calculating
relevance scores. Four variants of Tri-Attention
are generated by expanding the two-dimensional vector-based additive,
dot-product, scaled dot-product, and bilinear operations in Bi-Attention to the
tensor operations for Tri-Attention. Extensive experiments on three NLP tasks
demonstrate that Tri-Attention outperforms about 30 state-of-the-art
non-attention, standard Bi-Attention, contextual Bi-Attention approaches and
pretrained neural language models.
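One plausible reading of the scaled dot-product variant is a triple elementwise product of query, key, and a shared context vector; the exact tensor operation and the single shared context vector are assumptions here, sketched for intuition only:

```python
import numpy as np

def tri_attention(Q, K, C, V):
    """Sketch of a scaled dot-product Tri-Attention variant.

    Q: (n_q, d) queries, K: (n_k, d) keys, V: (n_k, d_v) values,
    C: (d,) a context vector (an assumed simplification; the paper
    works with full tensor operations over context sequences).
    """
    d = Q.shape[-1]
    # scores[i, j] = sum_h Q[i, h] * K[j, h] * C[h] / sqrt(d)
    scores = np.einsum('ih,jh,h->ij', Q, K, C) / np.sqrt(d)
    # Row-wise softmax over keys.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V
```

Note that with `C` set to all ones this collapses to standard scaled dot-product Bi-Attention, which matches the abstract's framing of Bi-Attention as the special case that ignores context.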
Open-Retrieval Conversational Question Answering
Conversational search is one of the ultimate goals of information retrieval.
Recent research approaches conversational search through simplified settings of
response ranking and conversational question answering, where an answer is
either selected from a given candidate set or extracted from a given passage.
These simplifications neglect the fundamental role of retrieval in
conversational search. To address this limitation, we introduce an
open-retrieval conversational question answering (ORConvQA) setting, where we
learn to retrieve evidence from a large collection before extracting answers,
as a further step towards building functional conversational search systems. We
create a dataset, OR-QuAC, to facilitate research on ORConvQA. We build an
end-to-end system for ORConvQA, featuring a retriever, a reranker, and a reader
that are all based on Transformers. Our extensive experiments on OR-QuAC
demonstrate that a learnable retriever is crucial for ORConvQA. We further show
that our system improves substantially when we enable history
modeling in all system components. Moreover, we show that the reranker
component contributes to the model performance by providing a regularization
effect. Finally, further in-depth analyses are performed to provide new
insights into ORConvQA.
Comment: Accepted to SIGIR'2
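The retriever-reranker-reader pipeline can be sketched with toy vector scorers standing in for the Transformer encoders; every scoring function and the span-extraction placeholder below are illustrative assumptions, not the system's actual components:

```python
import numpy as np

def orconvqa_pipeline(q_vec, passages, passage_vecs, top_k=2):
    """Toy retriever -> reranker -> reader pipeline.

    q_vec:        (d,) dense question encoding
    passages:     list of passage strings
    passage_vecs: (n, d) dense passage encodings
    """
    # Retriever: dense dot-product search over the full collection.
    retrieved = np.argsort(-(passage_vecs @ q_vec))[:top_k]

    # Reranker: rescore only the shortlist (here: cosine similarity,
    # standing in for a cross-encoder).
    def cosine(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9)
    reranked = sorted(retrieved, key=lambda i: -cosine(passage_vecs[i], q_vec))

    # Reader: extract an answer span from the top passage
    # (here: the longest word, purely as a placeholder).
    best = passages[reranked[0]]
    return max(best.split(), key=len)
```

The structural point survives the toy scorers: retrieval narrows a large collection, reranking reorders a shortlist, and reading extracts a span from one passage.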