Task-specific Objectives of Pre-trained Language Models for Dialogue Adaptation
Pre-trained Language Models (PrLMs) have been widely used as backbones in many Natural Language Processing (NLP) tasks. The common practice is to first pre-train on large-scale general corpora with task-independent LM training objectives, then fine-tune on task datasets with task-specific training objectives. Pre-training in a task-independent way enables the models to learn language representations that are universal to some extent, but it fails to capture crucial task-specific features. This leads to an incompatibility between pre-training and fine-tuning. To address this issue, we introduce task-specific pre-training on in-domain task-related corpora with task-specific objectives. This procedure is placed between the original two stages to enhance the model's understanding of specific tasks. In this work, we focus on Dialogue-related Natural Language Processing (DrNLP) tasks and design a Dialogue-Adaptive Pre-training Objective (DAPO) based on important dialogue qualities that are usually ignored by general LM pre-training objectives. PrLMs trained with DAPO on a large in-domain dialogue corpus are then fine-tuned for downstream DrNLP tasks. Experimental results show that models with DAPO surpass those with general LM pre-training objectives and other strong baselines on downstream DrNLP tasks.
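To make the three-stage pipeline concrete, below is a minimal sketch of such an intermediate, dialogue-adaptive pre-training stage. It assumes a generic encoder backbone and a hypothetical dialogue-quality classification head; the quality labels, the [CLS]-style pooling, and the loss-mixing weight `alpha` are illustrative assumptions, not the paper's exact DAPO formulation.

```python
import torch.nn as nn

class DialogueAdaptivePretrainer(nn.Module):
    """Intermediate-stage objective: general MLM loss plus a hypothetical
    dialogue-quality classification loss computed from a pooled representation."""

    def __init__(self, encoder, hidden_size, vocab_size, num_quality_labels, alpha=0.5):
        super().__init__()
        self.encoder = encoder                              # general-purpose PrLM backbone (assumed interface)
        self.mlm_head = nn.Linear(hidden_size, vocab_size)  # standard masked-LM head
        self.quality_head = nn.Linear(hidden_size, num_quality_labels)  # assumed dialogue-quality head
        self.alpha = alpha                                  # assumed loss-mixing weight
        self.ce = nn.CrossEntropyLoss(ignore_index=-100)    # -100 marks unmasked tokens

    def forward(self, input_ids, attention_mask, mlm_labels, quality_labels):
        # Assumed encoder interface: returns (batch, seq_len, hidden_size) states.
        hidden = self.encoder(input_ids, attention_mask)
        mlm_loss = self.ce(
            self.mlm_head(hidden).view(-1, self.mlm_head.out_features),
            mlm_labels.view(-1),
        )
        # Pool the first token as a sequence summary ([CLS]-style, an assumption).
        quality_loss = self.ce(self.quality_head(hidden[:, 0]), quality_labels)
        # Joint objective: retain the general LM signal, add dialogue-specific signal.
        return mlm_loss + self.alpha * quality_loss
```

After this adaptive stage, the encoder would be fine-tuned on the downstream DrNLP task as usual.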
Topic-Aware Multi-turn Dialogue Modeling
In retrieval-based multi-turn dialogue modeling, it remains a challenge to select the most appropriate response based on the salient features extracted from context utterances. As a conversation proceeds, discourse-level topic shifts naturally occur across the continuous multi-turn dialogue context. However, existing retrieval-based systems settle for exploiting local topic words for context utterance representation and fail to capture such essential global, discourse-level topic-aware clues. Instead of taking topic-agnostic n-gram utterances as the processing unit for matching, as existing systems do, this paper presents a novel topic-aware solution for multi-turn dialogue modeling, which segments and extracts topic-aware utterances in an unsupervised way, so that the resulting model can capture salient discourse-level topic shifts and thus effectively track topic flow during multi-turn conversation. Our topic-aware modeling is implemented by a newly proposed unsupervised topic-aware segmentation algorithm and a Topic-Aware Dual-attention Matching (TADAM) Network, which matches each topic segment with the response in a dual cross-attention way. Experimental results on three public datasets show that TADAM outperforms the state-of-the-art method, most notably by 3.3% on the E-commerce dataset, which features obvious topic shifts.
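One simple way to realize unsupervised topic-aware segmentation is to cut the dialogue wherever consecutive utterance embeddings diverge sharply. The sketch below assumes pre-computed utterance vectors (e.g., from a sentence encoder) and a fixed similarity threshold; both the cosine scoring and the threshold heuristic are illustrative, not necessarily the paper's exact algorithm.

```python
import numpy as np

def segment_by_topic(utterance_vectors, threshold=0.5):
    """Split a dialogue into topic segments at points where consecutive
    utterances are dissimilar enough to suggest a topic shift (assumed heuristic)."""
    if len(utterance_vectors) == 0:
        return []
    segments, current = [], [0]
    for i in range(1, len(utterance_vectors)):
        prev, cur = utterance_vectors[i - 1], utterance_vectors[i]
        sim = np.dot(prev, cur) / (np.linalg.norm(prev) * np.linalg.norm(cur) + 1e-8)
        if sim < threshold:            # similarity drop => assumed topic boundary
            segments.append(current)
            current = []
        current.append(i)
    segments.append(current)
    return segments                    # lists of utterance indices, one per topic
```

Each resulting segment would then be matched against the candidate response, e.g., with the dual cross-attention described above, rather than matching the flat utterance sequence.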
Multi-grained Evidence Inference for Multi-choice Reading Comprehension
Multi-choice Machine Reading Comprehension (MRC) is a major and challenging task in which machines answer questions by selecting among provided options. Answers in multi-choice MRC cannot be directly extracted from the given passages; they essentially require machines to reason over accurately extracted evidence. However, the critical evidence may be as brief as a single word or phrase, yet it is hidden in a redundant, noisy passage spanning multiple linguistic levels, from phrase and fragment through sentence up to the entire passage. We thus propose Multi-grained evidence inferencer (Mugen), a novel general-purpose model enhancement that integrates multi-grained evidence comprehensively to make up for this limitation. Mugen extracts evidence at three granularities, coarse-, middle-, and fine-grained, and integrates it with the original passages, achieving significant and consistent performance improvements on four multi-choice MRC benchmarks.

Comment: Accepted by TASLP 2023, vol. 31, pp. 3896-390
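As a rough illustration of extracting evidence at several granularities, the sketch below ranks sentences (coarse), comma-separated fragments (middle), and shared words (fine) by lexical overlap with the question and a candidate option, then hands the selections to the reader alongside the passage. The overlap scorer and the granularity definitions are simplifying assumptions; Mugen's actual extraction is learned rather than rule-based.

```python
def overlap_score(span, query):
    """Fraction of a span's tokens that also appear in the query (assumed scorer)."""
    span_tokens, query_tokens = set(span.lower().split()), set(query.lower().split())
    return len(span_tokens & query_tokens) / (len(span_tokens) + 1e-8)

def select_evidence(passage_sentences, question, option, top_k=1):
    """Pick coarse-, middle-, and fine-grained evidence for one (question, option) pair."""
    query = question + " " + option
    evidence = {}
    # Coarse: whole sentences ranked by overlap with question + option.
    ranked = sorted(passage_sentences, key=lambda s: overlap_score(s, query), reverse=True)
    evidence["coarse"] = ranked[:top_k]
    # Middle: comma-separated fragments within the selected sentences.
    fragments = [f.strip() for s in evidence["coarse"] for f in s.split(",")]
    evidence["middle"] = sorted(fragments, key=lambda f: overlap_score(f, query), reverse=True)[:top_k]
    # Fine: individual words in those fragments that the query also contains.
    query_words = set(query.lower().split())
    evidence["fine"] = [w for f in evidence["middle"] for w in f.split() if w.lower() in query_words]
    return evidence

# The evidence at each granularity can then be concatenated with the original
# passage before feeding the reader model, mirroring the integration step above.
```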