A Fully Attention-Based Information Retriever
Recurrent neural networks are now the state-of-the-art in natural language
processing because they can build rich contextual representations and process
texts of arbitrary length. However, recent developments in attention mechanisms
have equipped feedforward networks with similar capabilities, enabling faster
computation because more of their operations can be parallelized. We explore
this new type of architecture in the domain of
question-answering and propose a novel approach that we call Fully Attention
Based Information Retriever (FABIR). We show that FABIR achieves competitive
results on the Stanford Question Answering Dataset (SQuAD) while having fewer
parameters and being faster at both learning and inference than rival methods.
Comment: Accepted for presentation at the International Joint Conference on Neural Networks (IJCNN) 2018
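
The abstract does not detail FABIR's layers, but the core primitive behind any fully attention-based reader is scaled dot-product self-attention, which encodes all positions in parallel. A minimal PyTorch sketch (dimensions and module layout are illustrative assumptions, not FABIR's actual design):

```python
# Minimal scaled dot-product self-attention: the building block that lets
# feedforward models replace recurrence. Sizes are illustrative only.
import math
import torch
import torch.nn as nn

class SelfAttention(nn.Module):
    def __init__(self, d_model=128):
        super().__init__()
        self.q = nn.Linear(d_model, d_model)
        self.k = nn.Linear(d_model, d_model)
        self.v = nn.Linear(d_model, d_model)
        self.d_model = d_model

    def forward(self, x):                      # x: (batch, seq_len, d_model)
        q, k, v = self.q(x), self.k(x), self.v(x)
        # Every position attends to every other position in one matrix
        # product, so the whole sequence is processed in parallel.
        scores = q @ k.transpose(-2, -1) / math.sqrt(self.d_model)
        return torch.softmax(scores, dim=-1) @ v

x = torch.randn(2, 30, 128)                    # a toy batch of 30-token contexts
print(SelfAttention()(x).shape)                # torch.Size([2, 30, 128])
```

Because the attention weights come from a single matrix product, the sequence is encoded without the step-by-step dependency that limits parallelism in recurrent networks, which is the speed advantage the abstract refers to.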
Reinforced Mnemonic Reader for Machine Reading Comprehension
In this paper, we introduce the Reinforced Mnemonic Reader for machine
reading comprehension tasks, which enhances previous attentive readers in two
aspects. First, a reattention mechanism is proposed to refine current
attentions by directly accessing past attentions that are temporally
memorized in a multi-round alignment architecture, so as to avoid the problems
of attention redundancy and attention deficiency. Second, a new optimization
approach, called dynamic-critical reinforcement learning, is introduced to
extend the standard supervised method. It always encourages the model to predict
a more acceptable answer, addressing the convergence suppression problem that occurs
in traditional reinforcement learning algorithms. Extensive experiments on the
Stanford Question Answering Dataset (SQuAD) show that our model achieves
state-of-the-art results. Meanwhile, our model outperforms previous systems by
over 6% in terms of both Exact Match and F1 metrics on two adversarial SQuAD
datasets.
Comment: Published in the 27th International Joint Conference on Artificial Intelligence (IJCAI), 2018
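
The abstract only names the reattention idea; one plausible reading of "refining current attentions with temporally memorized past attentions" is mixing each alignment round's attention distribution with the previous round's. The convex-gating scheme below is an assumption for illustration, not the paper's exact formulation:

```python
# Hedged sketch of multi-round "reattention": each round's attention over
# the context is refined by the attention memorized from the previous
# round, discouraging redundant or missing alignments. The mixing scheme
# is illustrative, not the paper's formulation.
import torch
import torch.nn as nn

class Reattention(nn.Module):
    def __init__(self, rounds=3):
        super().__init__()
        self.rounds = rounds
        self.gamma = nn.Parameter(torch.tensor(0.0))  # learned mixing weight

    def forward(self, question, context):      # (B, Lq, d), (B, Lc, d)
        past = None
        for _ in range(self.rounds):
            scores = question @ context.transpose(-2, -1)  # (B, Lq, Lc)
            attn = torch.softmax(scores, dim=-1)
            if past is not None:
                # Convex mix with the memorized attention from the last
                # round; the result is still a valid distribution.
                g = torch.sigmoid(self.gamma)
                attn = (1 - g) * attn + g * past
            past = attn
            question = question + attn @ context  # aligned update per round
        return question

q, c = torch.randn(2, 12, 128), torch.randn(2, 40, 128)
print(Reattention()(q, c).shape)               # torch.Size([2, 12, 128])
```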
Making Neural QA as Simple as Possible but not Simpler
Recent development of large-scale question answering (QA) datasets triggered
a substantial amount of research into end-to-end neural architectures for QA.
Increasingly complex systems have been conceived without comparison to simpler
neural baseline systems that would justify their complexity. In this work, we
propose a simple heuristic that guides the development of neural baseline
systems for the extractive QA task. We find that there are two ingredients
necessary for building a high-performing neural QA system: first, awareness
of question words while processing the context, and second, a composition
function that goes beyond simple bag-of-words modeling, such as recurrent
neural networks. Our results show that FastQA, a system that meets these two
requirements, can achieve very competitive performance compared with existing
models. We argue that this surprising finding puts results of previous systems
and the complexity of recent QA datasets into perspective.
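
As a concrete reading of the two ingredients, the sketch below pairs a binary word-in-question feature (question-word awareness) with a BiLSTM composition function. The sizes and exact feature set are assumptions; FastQA's real configuration may differ:

```python
# Hedged sketch of the two ingredients: (1) a binary word-in-question
# feature marking context tokens that also appear in the question, and
# (2) a composition function beyond bag-of-words (here a BiLSTM).
import torch
import torch.nn as nn

def word_in_question(context_ids, question_ids):
    # 1.0 where the context token occurs anywhere in the question.
    return (context_ids.unsqueeze(-1) == question_ids.unsqueeze(1)) \
        .any(dim=-1).float()                   # (B, Lc)

class SimpleEncoder(nn.Module):
    def __init__(self, vocab=10_000, d=64):
        super().__init__()
        self.emb = nn.Embedding(vocab, d)
        # Composition beyond bag-of-words: a recurrent encoder over the
        # context, with the wiq feature appended to each embedding.
        self.rnn = nn.LSTM(d + 1, d, batch_first=True, bidirectional=True)

    def forward(self, context_ids, question_ids):
        wiq = word_in_question(context_ids, question_ids)   # (B, Lc)
        x = torch.cat([self.emb(context_ids), wiq.unsqueeze(-1)], dim=-1)
        out, _ = self.rnn(x)                   # (B, Lc, 2d)
        return out

ctx = torch.randint(0, 10_000, (2, 50))
qst = torch.randint(0, 10_000, (2, 10))
print(SimpleEncoder()(ctx, qst).shape)         # torch.Size([2, 50, 128])
```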
Contextualized Word Representations for Reading Comprehension
Reading a document and extracting an answer to a question about its content
has attracted substantial attention recently. While most work has focused on
the interaction between the question and the document, in this work we evaluate
the importance of context when the question and document are processed
independently. We take a standard neural architecture for this task, and show
that by providing rich contextualized word representations from a large
pre-trained language model as well as allowing the model to choose between
context-dependent and context-independent word representations, we can obtain
dramatic improvements and reach performance comparable to state-of-the-art on
the competitive SQuAD dataset.
Comment: 6 pages, 1 figure, NAACL 2018
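
A hedged sketch of the "choose between representations" idea: a learned per-token sigmoid gate that mixes a static word embedding with a contextual vector from a large pre-trained language model. The gating form and the source of the contextual vectors are assumptions, not necessarily the paper's exact mechanism:

```python
# Per-token gate that lets the model interpolate between a
# context-independent embedding and a context-dependent one.
import torch
import torch.nn as nn

class GatedWordRepr(nn.Module):
    def __init__(self, d=128):
        super().__init__()
        self.gate = nn.Linear(2 * d, d)

    def forward(self, static_emb, contextual_emb):  # both (B, L, d)
        # contextual_emb stands in for hidden states of a pre-trained LM;
        # here it is just a placeholder tensor.
        g = torch.sigmoid(self.gate(torch.cat([static_emb, contextual_emb], -1)))
        return g * contextual_emb + (1 - g) * static_emb

s = torch.randn(2, 20, 128)
c = torch.randn(2, 20, 128)
print(GatedWordRepr()(s, c).shape)             # torch.Size([2, 20, 128])
```

The gate is computed per token and per dimension, so the model can fall back to the static embedding where the contextual one adds nothing, which matches the abstract's point about letting the model choose.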