Word Embedding based Correlation Model for Question/Answer Matching
With the development of community-based question answering (Q&A) services,
large-scale Q&A archives have accumulated and become an important
information and knowledge resource on the web. Question and answer matching has
received considerable attention for its ability to reuse knowledge stored in
these systems: it can be useful in enhancing user experience with recurrent
questions. In this paper, we try to improve the matching accuracy by overcoming
the lexical gap between question and answer pairs. A Word Embedding based
Correlation (WEC) model is proposed by integrating advantages of both the
translation model and word embedding. Given a random pair of words, WEC can
score their co-occurrence probability in Q&A pairs, and it can also leverage the
continuity and smoothness of continuous space word representation to deal with
new pairs of words that are rare in the training parallel text. An experimental
study on the Yahoo! Answers and Baidu Zhidao datasets shows this new
method's promising potential.
Comment: 8 pages, 2 figures
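To make the idea of scoring word pairs across the lexical gap concrete, here is a minimal sketch. The abstract does not give the WEC formula, so everything below is an assumption for illustration: the word-level score is taken to be the cosine between a question word vector and an answer word vector mapped through a matrix `M`, and the pair-level score averages each question word's best match. The names `word_correlation`, `qa_score`, `emb` and `M` are hypothetical, and the toy vectors are made up.

```python
import math


def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def word_correlation(q_vec, a_vec, M):
    """Assumed word-level score: cosine between q_vec and M applied to a_vec."""
    return cosine(q_vec, matvec(M, a_vec))


def qa_score(q_words, a_words, emb, M):
    """Assumed pair-level score: mean over question words of the best answer-word match."""
    q_vecs = [emb[w] for w in q_words if w in emb]
    a_vecs = [emb[w] for w in a_words if w in emb]
    if not q_vecs or not a_vecs:
        return 0.0
    best = [max(word_correlation(q, a, M) for a in a_vecs) for q in q_vecs]
    return sum(best) / len(best)


# Toy, hand-made embeddings: "fever" and "aspirin" share no surface form,
# but the (assumed, here hand-set) matrix M maps one onto the other.
emb = {"fever": [1.0, 0.0], "aspirin": [0.0, 1.0], "car": [-1.0, 0.0]}
M = [[0.0, 1.0], [1.0, 0.0]]
score = qa_score(["fever"], ["aspirin", "car"], emb, M)
```

In a learned model `M` and the embeddings would be trained on Q&A pairs; because similar words have nearby vectors, a rarely seen word pair still receives a score close to that of its frequent neighbours.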
An Unsupervised Model with Attention Autoencoders for Question Retrieval
Question retrieval is a crucial subtask for community question answering.
Previous research focuses on supervised models, which depend heavily on training
data and manual feature engineering. In this paper, we propose a novel
unsupervised framework, namely reduced attentive matching network (RAMN), to
compute semantic matching between two questions. Our RAMN integrates
the deep semantic representations, the shallow lexical mismatching information
and the initial rank produced by an external search engine. For the first time,
we propose attention autoencoders to generate semantic representations of
questions. In addition, we employ lexical mismatching to capture surface
matching between two questions, which is derived from the importance of each
word in a question. We conduct experiments on the open CQA datasets of
SemEval-2016 and SemEval-2017. The experimental results show that our
unsupervised model performs comparably to the state-of-the-art
supervised methods in SemEval-2016 Task 3, and outperforms the best system in
SemEval-2017 Task 3 by a wide margin.
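The combination of three signals described above can be sketched as follows. This is not the RAMN formulation, which the abstract does not spell out. It assumes a lexical mismatch defined as the importance-weighted share of one question's words missing from the other, and a simple weighted sum with the reciprocal of the search-engine rank. The function names, the `importance` table and the weights are all hypothetical, and the semantic similarity is taken as a given input rather than computed by an attention autoencoder.

```python
def lexical_mismatch(q1, q2, importance):
    """Importance-weighted share of q1's words that do not appear in q2.

    Returns a value in [0, 1]; 0 means every word of q1 occurs in q2.
    Words absent from the importance table get a default weight of 1.0.
    """
    q2_set = set(q2)
    total = sum(importance.get(w, 1.0) for w in q1)
    if total == 0:
        return 0.0
    missing = sum(importance.get(w, 1.0) for w in q1 if w not in q2_set)
    return missing / total


def combined_score(semantic_sim, mismatch, initial_rank, weights=(1.0, 1.0, 1.0)):
    """Assumed fusion: weighted sum of semantic similarity, lexical overlap
    (1 - mismatch) and the reciprocal of the 1-based initial rank."""
    w_sem, w_lex, w_rank = weights
    return w_sem * semantic_sim + w_lex * (1.0 - mismatch) + w_rank / initial_rank


# Toy example with made-up importance weights.
importance = {"visa": 3.0, "qatar": 2.0, "get": 1.0}
new_q = ["get", "visa", "qatar"]
old_q = ["visa", "qatar", "cost"]
mm = lexical_mismatch(new_q, old_q, importance)
final = combined_score(semantic_sim=0.5, mismatch=mm, initial_rank=2)
```

Since no labels are involved, the weights here would have to be fixed by hand or chosen on a small development set; that choice is outside what the abstract states.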
Review-guided Helpful Answer Identification in E-commerce
Product-specific community question answering platforms can greatly help
address the concerns of potential customers. However, the user-provided answers
on such platforms often vary widely in quality. Helpfulness votes from
the community can indicate the overall quality of the answer, but they are
often missing. Accurately predicting the helpfulness of an answer to a given
question and thus identifying helpful answers is becoming a pressing need.
Since the helpfulness of an answer depends on multiple perspectives rather than
only the topical relevance investigated in typical QA tasks, common answer
selection algorithms are insufficient for tackling this task. In this paper, we
propose the Review-guided Answer Helpfulness Prediction (RAHP) model that not
only considers the interactions between QA pairs but also investigates the
opinion coherence between the answer and crowds' opinions reflected in the
reviews, which is another important factor in identifying helpful answers.
Moreover, we tackle the task of determining opinion coherence as a language
inference problem and explore the use of a pre-training strategy to
transfer the textual inference knowledge obtained from a specifically designed
trained network. Extensive experiments conducted on real-world data across
seven product categories show that our proposed model achieves superior
performance on the prediction task.
Comment: Accepted by WWW202
Multitask Learning with Deep Neural Networks for Community Question Answering
In this paper, we develop a deep neural network (DNN) that learns to simultaneously solve the three tasks of the cQA challenge proposed by SemEval-2016 Task 3, i.e., question-comment similarity, question-question similarity and new question-comment similarity. The last is the main task, which can exploit the other two to achieve better results. Our DNN is trained jointly on all three cQA tasks and learns to encode questions and comments into a single vector representation shared across the tasks. The results on the official challenge test set show that our approach produces higher accuracy and faster convergence rates than the individual neural networks. Additionally, our method, which does not use any manual feature engineering, approaches the state of the art established with methods that make heavy use of it.
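The joint training setup described above, one encoder shared by three tasks, can be sketched as a forward pass plus a summed loss. This is an illustrative assumption, not the authors' architecture: the encoder here is a mean of word vectors followed by a tanh layer, each task has its own small scoring head, and the joint loss is the plain sum of per-task binary cross-entropies. The task keys `"qc"`, `"qq"` and `"newqc"`, and all other names, are hypothetical.

```python
import math


def encode(tokens, emb, W):
    """Shared encoder: mean of known word vectors, then a tanh layer with weights W."""
    dim = len(W[0])
    vecs = [emb[t] for t in tokens if t in emb]
    mean = [sum(col) / len(vecs) for col in zip(*vecs)] if vecs else [0.0] * dim
    return [math.tanh(sum(w * x for w, x in zip(row, mean))) for row in W]


def head_score(u, v, head_w):
    """Task-specific head: sigmoid of a weighted elementwise product of u and v."""
    z = sum(w * a * b for w, a, b in zip(head_w, u, v))
    return 1.0 / (1.0 + math.exp(-z))


def joint_loss(batches, emb, W, heads):
    """Sum of binary cross-entropy over all tasks; every task uses the same encoder."""
    total = 0.0
    for task, examples in batches.items():
        for left, right, label in examples:
            p = head_score(encode(left, emb, W), encode(right, emb, W), heads[task])
            total -= label * math.log(p) + (1 - label) * math.log(1.0 - p)
    return total


# Toy setup: two-dimensional embeddings, identity encoder weights, zero heads.
emb = {"a": [1.0, 0.0], "b": [0.0, 1.0]}
W = [[1.0, 0.0], [0.0, 1.0]]
heads = {"qc": [0.0, 0.0], "qq": [0.0, 0.0], "newqc": [0.0, 0.0]}
batches = {
    "qc": [(["a"], ["a"], 1)],
    "qq": [(["a"], ["b"], 0)],
    "newqc": [(["b"], ["b"], 1)],
}
loss = joint_loss(batches, emb, W, heads)
```

Because `W` appears in every task's loss term, a gradient step on `joint_loss` updates the encoder with signal from all three tasks at once, which is the sharing effect the abstract attributes its accuracy and convergence gains to.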