1,241 research outputs found
Large Scale Question Paraphrase Retrieval with Smoothed Deep Metric Learning
The goal of a Question Paraphrase Retrieval (QPR) system is to retrieve
equivalent questions that result in the same answer as the original question.
Such a system can be used to understand and answer rare and noisy
reformulations of common questions by mapping them to a set of canonical forms.
This has large-scale applications for community Question Answering (cQA) and
open-domain spoken language question answering systems. In this paper we
describe a new QPR system implemented as a Neural Information Retrieval (NIR)
system consisting of a neural network sentence encoder and an approximate
k-Nearest Neighbour index for efficient vector retrieval. We also describe our
mechanism to generate an annotated dataset for question paraphrase retrieval
experiments automatically from question-answer logs via distant supervision. We
show that the standard loss function in NIR, triplet loss, does not perform
well with noisy labels. We propose smoothed deep metric loss (SDML) and with
our experiments on two QPR datasets we show that it significantly outperforms
triplet loss in the noisy label setting
- …