DaNetQA: a yes/no Question Answering Dataset for the Russian Language
DaNetQA, a new question-answering corpus, follows the design of Clark et al.
(2019): it comprises natural yes/no questions. Each question is paired with a
paragraph from Wikipedia and an answer, derived from the paragraph. The task is
to take both the question and a paragraph as input and come up with a yes/no
answer, i.e. to produce a binary output. In this paper, we present a
reproducible approach to DaNetQA creation and investigate transfer learning
methods for task and language transfer. For task transfer we leverage
three similar sentence modelling tasks: 1) a corpus of paraphrases,
Paraphraser, 2) an NLI task, for which we use the Russian part of XNLI, and 3)
another question answering task, SberQUAD. For language transfer we use
English to Russian translation together with multilingual language fine-tuning.
Comment: Analysis of Images, Social Networks and Texts - 9th International
Conference, AIST 2020, Skolkovo, Russia, October 15-16, 2020, Revised
Selected Papers. Lecture Notes in Computer Science
(https://dblp.org/db/series/lncs/index.html), Springer 202