Hashing based Answer Selection
Answer selection is an important subtask of question answering (QA), where
deep models usually achieve better performance. Most deep models adopt
question-answer interaction mechanisms, such as attention, to get vector
representations for answers. When these interaction based deep models are
deployed for online prediction, the representations of all answers need to be
recalculated for each question. This procedure is time-consuming for deep
models with complex encoders like BERT which usually have better accuracy than
simple encoders. One possible solution is to store the matrix representation
(encoder output) of each answer in memory to avoid recalculation. But this will
bring large memory cost. In this paper, we propose a novel method, called
hashing based answer selection (HAS), to tackle this problem. HAS adopts a
hashing strategy to learn a binary matrix representation for each answer, which
can dramatically reduce the memory cost for storing the matrix representations
of answers. Hence, HAS can adopt complex encoders like BERT in the model, but
the online prediction of HAS is still fast with a low memory cost. Experimental
results on three popular answer selection datasets show that HAS can outperform
existing models and achieve state-of-the-art performance.
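To make the memory saving concrete, here is a minimal sketch of the general idea of binarizing a stored answer representation. This is an illustration of sign-based binarization and bit-packing, not the paper's actual HAS hashing scheme; the matrix shape (128 tokens x 768 dimensions, as in a BERT-style encoder output) is an assumption for the example.

```python
import numpy as np

def binarize(matrix: np.ndarray) -> np.ndarray:
    """Map a real-valued matrix to {0, 1} by sign, then pack 8 bits per byte."""
    bits = (matrix > 0).astype(np.uint8)   # 1 where positive, else 0
    return np.packbits(bits, axis=-1)      # 1 bit per element instead of 32

rng = np.random.default_rng(0)
# Hypothetical encoder output for one answer: 128 tokens x 768 dims, float32.
answer_repr = rng.standard_normal((128, 768)).astype(np.float32)

packed = binarize(answer_repr)
float_bytes = answer_repr.nbytes   # 128 * 768 * 4 bytes
binary_bytes = packed.nbytes       # 128 * 96 bytes
print(float_bytes // binary_bytes)  # → 32
```

Storing 1 bit per entry instead of a 32-bit float gives a 32x reduction, which is why binary codes make caching per-answer matrix representations feasible.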
Explain by Evidence: An Explainable Memory-based Neural Network for Question Answering
Interpretability and explainability of deep neural networks are challenging
due to their scale, their complexity, and the lack of agreed-upon notions on
which the explaining process rests. Previous work, in particular, has focused on
representing internal components of neural networks through human-friendly
visuals and concepts. On the other hand, in real life, when making a decision,
humans tend to rely on similar situations and/or associations from the past.
Hence arguably, a promising approach to make the model transparent is to design
it in a way such that the model explicitly connects the current sample with the
seen ones, and bases its decision on these samples. Grounded on that principle,
we propose in this paper an explainable, evidence-based memory network
architecture, which learns to summarize the dataset and extract supporting
evidences to make its decision. Our model achieves state-of-the-art performance
on two popular question answering datasets (i.e. TrecQA and WikiQA). Via
further analysis, we show that this model can reliably trace the errors it has
made in the validation step to the training instances that might have caused
these errors. We believe that this error-tracing capability provides
significant benefit in improving dataset quality in many applications.
Comment: Accepted to COLING 202
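The evidence-retrieval idea above can be sketched in miniature: store embeddings of training instances in a memory, and for each prediction return the most similar stored instances as supporting evidence, so that errors can be traced back to the training samples that likely caused them. This is a generic nearest-neighbor illustration under assumed toy embeddings, not the paper's actual memory-network architecture.

```python
import numpy as np

def top_evidence(query: np.ndarray, memory: np.ndarray, k: int = 2) -> np.ndarray:
    """Return indices of the k memory rows most cosine-similar to the query."""
    q = query / np.linalg.norm(query)
    m = memory / np.linalg.norm(memory, axis=1, keepdims=True)
    scores = m @ q                  # cosine similarity to each stored instance
    return np.argsort(-scores)[:k]  # highest-scoring evidence first

# Hypothetical 2-d embeddings of three training instances.
memory = np.array([[1.0, 0.0],
                   [0.0, 1.0],
                   [0.9, 0.1]])
query = np.array([1.0, 0.05])       # embedding of a new test sample

print(top_evidence(query, memory))  # → [0 2]
```

If the model's answer for this query is wrong, the returned indices point directly at the stored training instances that drove the decision, which is the error-tracing capability the abstract describes.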