DaNetQA: a yes/no Question Answering Dataset for the Russian Language
DaNetQA, a new question-answering corpus, follows the (Clark et al., 2019)
design: it comprises natural yes/no questions. Each question is paired with a
paragraph from Wikipedia and an answer, derived from the paragraph. The task is
to take both the question and a paragraph as input and come up with a yes/no
answer, i.e. to produce a binary output. In this paper, we present a
reproducible approach to DaNetQA creation and investigate transfer-learning
methods for task and language transfer. For task transfer we leverage
three similar sentence-modelling tasks: 1) a corpus of paraphrases,
Paraphraser, 2) an NLI task, for which we use the Russian part of XNLI, and 3)
another question-answering task, SberQUAD. For language transfer we use
English-to-Russian translation together with multilingual language fine-tuning.

Comment: Analysis of Images, Social Networks and Texts - 9th International
Conference, AIST 2020, Skolkovo, Russia, October 15-16, 2020, Revised
Selected Papers. Lecture Notes in Computer Science
(https://dblp.org/db/series/lncs/index.html), Springer 202
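The task format described above can be illustrated with a toy sketch: a model receives a (question, paragraph) pair and must emit a binary yes/no answer. The keyword-overlap heuristic below is only an illustrative baseline showing the input/output contract, not the paper's method (which fine-tunes pretrained language models); all names and the threshold are assumptions.

```python
# Toy illustration of a yes/no QA task in the DaNetQA style:
# input is a (question, paragraph) pair, output is a binary answer.
# This overlap heuristic is NOT the paper's approach -- it only
# demonstrates the task's I/O contract.

def answer_yes_no(question: str, paragraph: str) -> bool:
    """Return True ("yes") iff most question content words occur in the paragraph."""
    stop = {"is", "the", "a", "an", "do", "does", "did", "was", "were", "are"}
    q_words = {w.strip("?.,!").lower() for w in question.split()} - stop
    p_words = {w.strip("?.,!").lower() for w in paragraph.split()}
    if not q_words:
        return False
    overlap = len(q_words & p_words) / len(q_words)
    return overlap >= 0.5  # arbitrary threshold for the sketch

print(answer_yes_no("Is Moscow the capital of Russia?",
                    "Moscow is the capital and largest city of Russia."))  # True
```

In the paper's setting this classifier would be replaced by a fine-tuned (multilingual) language model scoring the concatenated question and paragraph.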
A Pre-training Based Personalized Dialogue Generation Model with Persona-sparse Data
Endowing dialogue systems with personas is essential to deliver more
human-like conversations. However, this problem is still far from well explored
due to the difficulties of both embodying personalities in natural languages
and the persona sparsity issue observed in most dialogue corpora. This paper
proposes a pre-training based personalized dialogue model that can generate
coherent responses using persona-sparse dialogue data. In this method, a
pre-trained language model is used to initialize an encoder and decoder, and
personal attribute embeddings are devised to model richer dialogue contexts by
encoding speakers' personas together with dialogue histories. Further, to
incorporate the target persona in the decoding process and to balance its
contribution, an attention routing structure is devised in the decoder to merge
features extracted from the target persona and dialogue contexts using
dynamically predicted weights. Our model can utilize persona-sparse dialogues
in a unified manner during the training process, and can also control the
amount of persona-related features to exhibit during the inference process.
Both automatic and manual evaluations demonstrate that the proposed model
outperforms state-of-the-art methods in generating coherent and
persona-consistent responses with persona-sparse data.

Comment: Long paper accepted at AAAI 202
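The attention-routing idea described above can be sketched in a few lines: features from the target persona and from the dialogue context are merged with a dynamically predicted weight, and a bias term lets the caller control how much persona signal to exhibit at inference time. The function names, the gating score, and the bias mechanism are illustrative assumptions, not the paper's exact architecture.

```python
import math

# Minimal sketch (assumed names, not the paper's exact design) of merging
# persona and context features with a dynamically predicted weight.

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def route(persona_feat, context_feat, gate_params, persona_bias=0.0):
    """Blend two feature vectors with a predicted gate in [0, 1].

    persona_bias pushes the gate up or down at inference time,
    mimicking control over the amount of persona-related features.
    """
    score = sum(w * c for w, c in zip(gate_params, context_feat)) + persona_bias
    alpha = sigmoid(score)                   # dynamically predicted weight
    return [alpha * p + (1.0 - alpha) * c    # convex combination of features
            for p, c in zip(persona_feat, context_feat)]

merged = route([1.0, 0.0], [0.0, 1.0], gate_params=[0.5, -0.5])
```

In the actual model this gating happens inside the decoder's attention layers over encoder features; the sketch keeps only the weighted-merge idea.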