
    ANTIQUE: A Non-Factoid Question Answering Benchmark

    Considering the widespread use of mobile and voice search, answer passage retrieval for non-factoid questions plays a critical role in modern information retrieval systems. Despite the importance of the task, the community still lacks large-scale non-factoid question answering collections with real questions and comprehensive relevance judgments. In this paper, we develop and release a collection of 2,626 open-domain non-factoid questions from a diverse set of categories. The dataset, called ANTIQUE, contains 34,011 manual relevance annotations. The questions were asked by real users in a community question answering service, namely Yahoo! Answers. Relevance judgments for all the answers to each question were collected through crowdsourcing. To facilitate further research, we also include a brief analysis of the data as well as baseline results on both classical and recently developed neural IR models.
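    The baseline comparisons above rely on graded relevance judgments. As a rough illustration of how a ranked run might be scored against such judgments, the sketch below computes mean reciprocal rank; the qrels/run layout and the grade cutoff are assumptions for illustration, not the official ANTIQUE format or evaluation protocol.

```python
# Minimal sketch: scoring a ranked run against graded relevance judgments.
# The qrels/run structures and the "relevant if grade >= 3" cutoff are
# assumptions for illustration, not the official ANTIQUE evaluation setup.

def mean_reciprocal_rank(qrels, run, k=10, relevant_grade=3):
    """qrels: {qid: {doc_id: grade}}, run: {qid: [doc_id, ...]} ranked lists."""
    scores = []
    for qid, ranking in run.items():
        judged = qrels.get(qid, {})
        rr = 0.0
        for rank, doc_id in enumerate(ranking[:k], start=1):
            if judged.get(doc_id, 0) >= relevant_grade:
                rr = 1.0 / rank
                break
        scores.append(rr)
    return sum(scores) / len(scores) if scores else 0.0

# Toy data: two questions, one baseline run.
qrels = {"q1": {"a1": 4, "a2": 1}, "q2": {"a3": 3}}
run = {"q1": ["a2", "a1"], "q2": ["a3", "a9"]}
print(mean_reciprocal_rank(qrels, run))  # (1/2 + 1/1) / 2 = 0.75
```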

    A syntactic candidate ranking method for answering non-copulative questions

    Question answering (QA) is the act of retrieving answers to questions posed in natural language. It is regarded as requiring more complex natural language processing (NLP) techniques than other types of information retrieval such as document retrieval, and is sometimes regarded as the next step beyond search engines, one that ranks the retrieved candidates. Given a set of candidate sentences that contain keywords in common with the question, deciding which one actually answers the question is a challenge in question answering. In this thesis we propose a linguistic method for measuring the syntactic similarity of each candidate sentence to the question. This candidate scoring method uses the question head as an anchor to narrow down the search to a subtree in the parse tree of a candidate sentence (the target subtree). Semantic similarity of the action in the target subtree to the action asked about in the question is then measured by applying WordNet::Similarity to their main verbs. To verify the syntactic similarity of this subtree to the question parse tree, syntactic restrictions as well as lexical measures compute the unifiability of the critical syntactic participants in them. Finally, when answering a factoid open domain question, the noun phrase of the expected answer type in the target subtree of the best candidate sentence is extracted and returned.

    In this thesis, we address both closed and open domain question answering problems. Initially, we propose our syntactic scoring method as a solution for questions in the Telecommunications domain. For our experiments in a closed domain, we build a set of customer service question/answer pairs from Bell Canada's Web pages. We show that the performance of this ranking method depends on the syntactic and lexical similarities in a question/answer pair. We observed that these closed domain questions ask for specific properties, procedures, or conditions about a technical topic, and are sometimes open-ended as well. As a result, a detailed understanding of the question and the corpus text is required for answering them. As opposed to closed domain questions, however, open domain questions have no restriction on the topics they can ask about. The standard test bed for open domain question answering is the question/answer sets provided each year by NIST through the TREC QA conferences. These are factoid questions that ask about a person, date, time, location, etc.

    Since our method relies on the semantic similarity of the main verbs as well as the syntactic overlap of counterpart subtrees from the question and the target subtrees, it performs well on questions with a main content verb and a conventional subject-verb-object syntactic structure. The distribution of this type of question versus questions having a 'to be' main verb is significantly different in closed versus open domain: around 70% of closed domain questions have a main content verb, while more than 67% of open domain questions have a 'to be' main verb. This verb is very flexible in connecting sentence entities, so recognizing equivalent syntactic structures between two copula parse trees is very hard. As a result, to better analyze the accuracy of this method, we create a new question categorization based on the question's main verb type: copulative questions ask about a state using a 'to be' verb, while non-copulative questions contain a main non-copula verb indicating an action or event.

    Our candidate answer ranking method achieves a precision of 47.0% in our closed domain, and 48% in answering the TREC 2003 to 2006 non-copulative questions. For answering open domain factoid questions, we feed the output of Aranea, a competitive question answering system in TREC 2002, to our linguistic method in order to provide it with Web redundancy statistics. This level of performance confirms our hypothesis of the potential usefulness of syntactic mapping for answering questions with a main content verb.
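    As a rough sketch of the candidate-scoring idea described above, combining main-verb similarity with lexical overlap between question and candidate, the snippet below uses NLTK's WordNet interface as a stand-in for WordNet::Similarity; the weighting scheme and toy data are illustrative assumptions, not the thesis's exact formula.

```python
# Sketch of the candidate-scoring idea: combine main-verb similarity
# (via WordNet) with lexical overlap between question and candidate.
# NLTK's WordNet is used as a stand-in for WordNet::Similarity, and the
# weights are an illustrative assumption, not the thesis's exact method.
# Assumes the WordNet corpus is available, e.g. via nltk.download("wordnet").
from nltk.corpus import wordnet as wn

def verb_similarity(verb_a, verb_b):
    """Best path similarity between any verb senses of the two verbs."""
    best = 0.0
    for sa in wn.synsets(verb_a, pos=wn.VERB):
        for sb in wn.synsets(verb_b, pos=wn.VERB):
            sim = sa.path_similarity(sb) or 0.0
            best = max(best, sim)
    return best

def score_candidate(question_tokens, question_verb, cand_tokens, cand_verb):
    overlap = len(set(question_tokens) & set(cand_tokens)) / max(len(set(question_tokens)), 1)
    return 0.6 * verb_similarity(question_verb, cand_verb) + 0.4 * overlap

# Rank candidate sentences for a question (toy example).
question = (["who", "invented", "the", "telephone"], "invent")
candidates = [
    (["bell", "created", "the", "telephone", "in", "1876"], "create"),
    (["the", "telephone", "is", "a", "device"], "be"),
]
ranked = sorted(candidates, key=lambda c: score_candidate(*question, *c), reverse=True)
print([" ".join(c[0]) for c in ranked])
```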

    Follow-up question handling in the IMIX and Ritel systems: A comparative study

    One of the basic topics of question answering (QA) dialogue systems is how follow-up questions should be interpreted by a QA system. In this paper, we shall discuss our experience with the IMIX and Ritel systems, for both of which a follow-up question handling scheme has been developed, and corpora have been collected. These two systems are each other's opposites in many respects: IMIX is multimodal, non-factoid, black-box QA, while Ritel is speech, factoid, keyword-based QA. Nevertheless, we will show that they are quite comparable, and that it is fruitful to examine the similarities and differences. We shall look at how the systems are composed, and how real, non-expert users interact with the systems. We shall also provide comparisons with systems from the literature where possible, and indicate where open issues lie and in what areas existing systems may be improved. We conclude that most systems have a common architecture with a set of common subtasks, in particular detecting follow-up questions and finding referents for them. We characterise these tasks using the typical techniques used for performing them, and data from our corpora. We also identify a special type of follow-up question, the discourse question, which is asked when the user is trying to understand an answer, and propose some basic methods for handling it.
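    The first of the common subtasks identified above, detecting follow-up questions, can be illustrated with a crude heuristic detector; the cue list and length threshold below are assumptions for illustration, not the rules implemented in IMIX or Ritel.

```python
# Heuristic sketch of follow-up question detection: flag questions that
# contain anaphoric pronouns or are too short/elliptical to stand alone.
# The cue set and threshold are illustrative assumptions, not the rules
# used in the IMIX or Ritel systems.
ANAPHORIC_CUES = {"it", "they", "them", "he", "she", "this", "that", "these", "those"}

def is_follow_up(question: str, min_standalone_tokens: int = 4) -> bool:
    tokens = question.lower().rstrip("?").split()
    if any(tok in ANAPHORIC_CUES for tok in tokens):
        return True                              # likely refers back to an earlier answer
    return len(tokens) < min_standalone_tokens   # elliptical, e.g. "And the sequel?"

print(is_follow_up("Who directed the film Alien?"))  # False
print(is_follow_up("When was it released?"))         # True (pronoun "it")
print(is_follow_up("And the sequel?"))               # True (too short to stand alone)
```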

    ComQA: A Community-sourced Dataset for Complex Factoid Question Answering with Paraphrase Clusters

    To bridge the gap between the capabilities of the state of the art in factoid question answering (QA) and what users ask, we need large datasets of real user questions that capture the various question phenomena users are interested in, and the diverse ways in which these questions are formulated. We introduce ComQA, a large dataset of real user questions that exhibit different challenging aspects such as compositionality, temporal reasoning, and comparisons. ComQA questions come from the WikiAnswers community QA platform, which typically contains questions that are not satisfactorily answerable by existing search engine technology. Through a large crowdsourcing effort, we clean the question dataset, group questions into paraphrase clusters, and annotate clusters with their answers. ComQA contains 11,214 questions grouped into 4,834 paraphrase clusters. We detail the process of constructing ComQA, including the measures taken to ensure its high quality while making effective use of crowdsourcing. We also present an extensive analysis of the dataset and the results achieved by state-of-the-art systems on ComQA, demonstrating that our dataset can be a driver of future research on QA. Comment: 11 pages, NAACL 2019.
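    A paraphrase cluster of this kind can be modeled as a simple record of question formulations sharing a gold answer set; the field names and the exact-match check below are assumptions for illustration, not ComQA's released schema or official evaluation.

```python
# Sketch of a paraphrase-cluster record as ComQA-style data might be modeled.
# Field names, types, and the matching logic are assumptions for illustration,
# not the dataset's official JSON schema.
from dataclasses import dataclass
from typing import List

@dataclass
class ParaphraseCluster:
    cluster_id: str
    questions: List[str]   # different user formulations of the same question
    answers: List[str]     # shared gold answers for every paraphrase

    def covers(self, predicted: str) -> bool:
        """Crude exact-match check of a system answer against the gold set."""
        return predicted.strip().lower() in {a.strip().lower() for a in self.answers}

cluster = ParaphraseCluster(
    cluster_id="example-1",
    questions=["who wrote the novel dracula?", "dracula was written by which author?"],
    answers=["Bram Stoker"],
)
print(cluster.covers("bram stoker"))  # True
```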

    Hi, how can I help you?: Automating enterprise IT support help desks

    Question answering is one of the primary challenges of natural language understanding. In realizing such a system, providing complex, long answers to questions is a more challenging task than factoid answering, as the former requires context disambiguation. The different methods explored in the literature can be broadly classified into three categories: 1) classification based, 2) knowledge graph based, and 3) retrieval based. Individually, none of them addresses the need for an enterprise-wide assistance system in an IT support and maintenance domain. In this domain, the variance of answers is large, ranging from factoids to structured operating procedures; the knowledge is spread across heterogeneous data sources such as application-specific documentation and ticket management systems; and no single technique for general-purpose assistance can scale to such a landscape. To address this, we have built a cognitive platform with capabilities adapted for this domain. Further, we have built a general-purpose question answering system leveraging the platform that can be instantiated for multiple products and technologies in the support domain. The system uses a novel hybrid answering model that orchestrates across a deep learning classifier, a knowledge graph based context disambiguation module, and a sophisticated bag-of-words search system. This orchestration performs context switching for a given question and hands the question off smoothly to a human expert if none of the automated techniques can provide a confident answer. This system has been deployed across 675 internal enterprise IT support and maintenance projects. Comment: To appear in IAAI 2018.
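    The hybrid answering model described above boils down to a confidence-gated orchestration with a human hand-off. The sketch below illustrates that control flow; the component interfaces, scores, and threshold are all assumed for illustration rather than taken from the deployed system.

```python
# Sketch of confidence-based orchestration: try each automated answerer and
# hand off to a human expert when none is confident enough. Component
# interfaces, scores, and the threshold are illustrative assumptions.
from typing import Callable, List, Optional, Tuple

Answerer = Callable[[str], Tuple[Optional[str], float]]  # returns (answer, confidence)

def orchestrate(question: str, answerers: List[Answerer], threshold: float = 0.7) -> str:
    best_answer, best_conf = None, 0.0
    for answer_fn in answerers:
        answer, conf = answer_fn(question)
        if answer is not None and conf > best_conf:
            best_answer, best_conf = answer, conf
    if best_conf >= threshold:
        return best_answer
    return "Handing off to a human expert."  # no automated module is confident enough

# Toy stand-ins for the classifier, knowledge-graph module, and BOW search.
classifier = lambda q: ("Restart the VPN client.", 0.42)
kg_module  = lambda q: (None, 0.0)
bow_search = lambda q: ("See the KB article on VPN setup.", 0.81)

print(orchestrate("My VPN keeps disconnecting", [classifier, kg_module, bow_search]))
```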