13 research outputs found

    Word Embedding based Correlation Model for Question/Answer Matching

    Full text link
    With the development of community based question answering (Q&A) services, a large scale of Q&A archives have been accumulated and are an important information and knowledge resource on the web. Question and answer matching has been attached much importance to for its ability to reuse knowledge stored in these systems: it can be useful in enhancing user experience with recurrent questions. In this paper, we try to improve the matching accuracy by overcoming the lexical gap between question and answer pairs. A Word Embedding based Correlation (WEC) model is proposed by integrating advantages of both the translation model and word embedding, given a random pair of words, WEC can score their co-occurrence probability in Q&A pairs and it can also leverage the continuity and smoothness of continuous space word representation to deal with new pairs of words that are rare in the training parallel text. An experimental study on Yahoo! Answers dataset and Baidu Zhidao dataset shows this new method's promising potential.Comment: 8 pages, 2 figure

    An Unsupervised Model with Attention Autoencoders for Question Retrieval

    Full text link
    Question retrieval is a crucial subtask for community question answering. Previous research focus on supervised models which depend heavily on training data and manual feature engineering. In this paper, we propose a novel unsupervised framework, namely reduced attentive matching network (RAMN), to compute semantic matching between two questions. Our RAMN integrates together the deep semantic representations, the shallow lexical mismatching information and the initial rank produced by an external search engine. For the first time, we propose attention autoencoders to generate semantic representations of questions. In addition, we employ lexical mismatching to capture surface matching between two questions, which is derived from the importance of each word in a question. We conduct experiments on the open CQA datasets of SemEval-2016 and SemEval-2017. The experimental results show that our unsupervised model obtains comparable performance with the state-of-the-art supervised methods in SemEval-2016 Task 3, and outperforms the best system in SemEval-2017 Task 3 by a wide margin

    Review-guided Helpful Answer Identification in E-commerce

    Full text link
    Product-specific community question answering platforms can greatly help address the concerns of potential customers. However, the user-provided answers on such platforms often vary a lot in their qualities. Helpfulness votes from the community can indicate the overall quality of the answer, but they are often missing. Accurately predicting the helpfulness of an answer to a given question and thus identifying helpful answers is becoming a demanding need. Since the helpfulness of an answer depends on multiple perspectives instead of only topical relevance investigated in typical QA tasks, common answer selection algorithms are insufficient for tackling this task. In this paper, we propose the Review-guided Answer Helpfulness Prediction (RAHP) model that not only considers the interactions between QA pairs but also investigates the opinion coherence between the answer and crowds' opinions reflected in the reviews, which is another important factor to identify helpful answers. Moreover, we tackle the task of determining opinion coherence as a language inference problem and explore the utilization of pre-training strategy to transfer the textual inference knowledge obtained from a specifically designed trained network. Extensive experiments conducted on real-world data across seven product categories show that our proposed model achieves superior performance on the prediction task.Comment: Accepted by WWW202

    Multitask Learning with Deep Neural Networks for Community Question Answering

    Get PDF
    In this paper, we developed a deep neural network (DNN) that learns to solve simultaneously the three tasks of the cQA challenge proposed by the SemEval-2016 Task 3, i.e., question-comment similarity, question-question similarity and new question-comment similarity. The latter is the main task, which can exploit the previous two for achieving better results. Our DNN is trained jointly on all the three cQA tasks and learns to encode questions and comments into a single vector representation shared across the multiple tasks. The results on the official challenge test set show that our approach produces higher accuracy and faster convergence rates than the individual neural networks. Additionally, our method, which does not use any manual feature engineering, approaches the state of the art established with methods that make heavy use of it
    corecore