ANTIQUE: A Non-Factoid Question Answering Benchmark
Considering the widespread use of mobile and voice search, answer passage
retrieval for non-factoid questions plays a critical role in modern information
retrieval systems. Despite the importance of the task, the community still
lacks large-scale non-factoid question answering collections with real
questions and comprehensive relevance judgments. In this
paper, we develop and release a collection of 2,626 open-domain non-factoid
questions from a diverse set of categories. The dataset, called ANTIQUE,
contains 34,011 manual relevance annotations. The questions were asked by real
users of a community question answering service, namely Yahoo! Answers.
Relevance judgments for all the answers to each question were collected through
crowdsourcing. To facilitate further research, we also include a brief analysis
of the data as well as baseline results on both classical and recently
developed neural IR models.
A syntactic candidate ranking method for answering non-copulative questions
Question answering (QA) is the act of retrieving answers to questions posed in natural language. It is regarded as requiring more complex natural language processing (NLP) techniques than other types of information retrieval such as document retrieval. QA is sometimes regarded as the next step beyond search engines, one that ranks the retrieved candidates. Given a set of candidate sentences which contain keywords in common with the question, deciding which one actually answers the question is a challenge in question answering. In this thesis we propose a linguistic method for measuring the syntactic similarity of each candidate sentence to the question. This candidate scoring method uses the question head as an anchor to narrow down the search to a subtree in the parse tree of a candidate sentence (the target subtree). The semantic similarity of the action in the target subtree to the action asked about in the question is then measured by applying WordNet::Similarity to their main verbs. To verify the syntactic similarity of this subtree to the question parse tree, syntactic restrictions as well as lexical measures compute the unifiability of the critical syntactic participants in them. Finally, the noun phrase of the expected answer type in the target subtree is extracted from the best candidate sentence and returned when answering a factoid open domain question. In this thesis, we address both closed and open domain question answering problems. Initially, we propose our syntactic scoring method as a solution for questions in the Telecommunications domain. For our experiments in a closed domain, we build a set of customer service question/answer pairs from Bell Canada's Web pages. We show that the performance of this ranking method depends on the syntactic and lexical similarities in a question/answer pair. We observed that these closed domain questions ask for specific properties, procedures, or conditions about a technical topic. They are sometimes open-ended as well.
As a result, detailed understanding of the question and the corpus text is required for answering them. As opposed to closed domain questions, however, open domain questions have no restriction on the topic they can ask about. The standard test bed for open domain question answering is the question/answer sets provided each year by NIST through the TREC QA conferences. These are factoid questions that ask about a person, date, time, location, etc. Since our method relies on the semantic similarity of the main verbs as well as the syntactic overlap of counterpart subtrees from the question and the target subtrees, it performs well on questions with a main content verb and a conventional subject-verb-object syntactic structure. The distribution of this type of question versus questions having a 'to be' main verb differs significantly between the closed and open domains: around 70% of closed domain questions have a main content verb, while more than 67% of open domain questions have a 'to be' main verb. This verb is very flexible in connecting sentence entities; therefore, recognizing equivalent syntactic structures between two copula parse trees is very hard. As a result, to better analyze the accuracy of this method, we create a new question categorization based on the question's main verb type: copulative questions ask about a state using a 'to be' verb, while non-copulative questions contain a main non-copula verb indicating an action or event. Our candidate answer ranking method achieves a precision of 47.0% in our closed domain, and 48% in answering the TREC 2003 to 2006 non-copulative questions. For answering open domain factoid questions, we feed the output of Aranea, a competitive question answering system in TREC 2002, to our linguistic method in order to provide it with Web redundancy statistics. This level of performance confirms our hypothesis about the potential usefulness of syntactic mapping for answering questions with a main content verb.
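The verb-anchored candidate scoring described above can be sketched as follows. This is a minimal illustration, not the thesis implementation: the tiny hypernym map stands in for WordNet, and the similarity is 1 / (1 + shortest-path length), in the spirit of WordNet::Similarity's "path" measure; all verbs and scores here are invented for the example.

```python
# Illustrative sketch of verb-based candidate ranking. The hypernym
# graph below is a hypothetical mini-taxonomy standing in for WordNet.
HYPERNYMS = {
    "buy": "get", "purchase": "get", "sell": "transfer",
    "get": "act", "transfer": "act",
}

def path_to_root(verb):
    """Follow hypernym links from a verb up to the taxonomy root."""
    path = [verb]
    while path[-1] in HYPERNYMS:
        path.append(HYPERNYMS[path[-1]])
    return path

def path_similarity(v1, v2):
    """1 / (1 + shortest path via a common ancestor); 0.0 if unrelated."""
    p1, p2 = path_to_root(v1), path_to_root(v2)
    common = set(p1) & set(p2)
    if not common:
        return 0.0
    dist = min(p1.index(c) + p2.index(c) for c in common)
    return 1.0 / (1.0 + dist)

def rank_candidates(question_verb, candidates):
    """candidates: (sentence, main_verb) pairs; best match first."""
    return sorted(candidates,
                  key=lambda sc: path_similarity(question_verb, sc[1]),
                  reverse=True)
```

In the full method this score is only one signal; the syntactic unifiability checks on the target subtree gate which candidates are scored at all.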
Follow-up question handling in the IMIX and Ritel systems: A comparative study
One of the basic topics of question answering (QA) dialogue systems is how follow-up questions should be interpreted by a QA system. In this paper, we shall discuss our experience with the IMIX and Ritel systems, for both of which a follow-up question handling scheme has been developed, and corpora have been collected. These two systems are each other's opposites in many respects: IMIX is multimodal, non-factoid, black-box QA, while Ritel is speech, factoid, keyword-based QA. Nevertheless, we will show that they are quite comparable, and that it is fruitful to examine the similarities and differences. We shall look at how the systems are composed, and how real, non-expert users interact with the systems. We shall also provide comparisons with systems from the literature where possible, and indicate where open issues lie and in what areas existing systems may be improved. We conclude that most systems have a common architecture with a set of common subtasks, in particular detecting follow-up questions and finding referents for them. We characterise these tasks using the typical techniques used for performing them, and data from our corpora. We also identify a special type of follow-up question, the discourse question, which is asked when the user is trying to understand an answer, and propose some basic methods for handling it.
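The follow-up detection subtask identified above is often approached with surface cues. A minimal sketch of that idea, assuming a cue-based detector (the pronoun list and the ellipsis length threshold are illustrative choices, not taken from either system):

```python
# Hedged sketch of surface-cue follow-up question detection: flag a
# question as a follow-up if it is elliptical (very short) or contains
# an anaphoric pronoun that needs a referent from the dialogue context.
ANAPHORIC = {"it", "he", "she", "they", "them", "that", "this", "those"}

def is_followup(question, max_ellipsis_tokens=3):
    tokens = [t.strip("?.,!").lower() for t in question.split()]
    if len(tokens) <= max_ellipsis_tokens:      # "And when?" style ellipsis
        return True
    return any(t in ANAPHORIC for t in tokens)  # "Where was he born?"
```

A real system would pair such a detector with a referent-resolution step, which is the second common subtask the paper discusses.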
ComQA: A Community-sourced Dataset for Complex Factoid Question Answering with Paraphrase Clusters
To bridge the gap between the capabilities of the state-of-the-art in factoid
question answering (QA) and what users ask, we need large datasets of real user
questions that capture the various question phenomena users are interested in,
and the diverse ways in which these questions are formulated. We introduce
ComQA, a large dataset of real user questions that exhibit different
challenging aspects such as compositionality, temporal reasoning, and
comparisons. ComQA questions come from the WikiAnswers community QA platform,
which typically contains questions that are not satisfactorily answerable by
existing search engine technology. Through a large crowdsourcing effort, we
clean the question dataset, group questions into paraphrase clusters, and
annotate clusters with their answers. ComQA contains 11,214 questions grouped
into 4,834 paraphrase clusters. We detail the process of constructing ComQA,
including the measures taken to ensure its high quality while making effective
use of crowdsourcing. We also present an extensive analysis of the dataset and
the results achieved by state-of-the-art systems on ComQA, demonstrating that
our dataset can be a driver of future research on QA.
Comment: 11 pages, NAACL 201
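The cluster-level annotation scheme described above can be sketched as a data layout: questions are grouped into paraphrase clusters, and gold answers are attached once per cluster rather than once per question. The field names below are illustrative, not the dataset's actual schema.

```python
# Hedged sketch of a paraphrase-cluster layout in the ComQA style.
from dataclasses import dataclass, field

@dataclass
class ParaphraseCluster:
    cluster_id: str
    questions: list                              # paraphrases of one need
    answers: list = field(default_factory=list)  # shared gold answers

def answers_for(question, clusters):
    """Look up the shared answer set via the question's cluster."""
    for c in clusters:
        if question in c.questions:
            return c.answers
    return []
```

One benefit of this layout is that systems can be evaluated both per question and per cluster, since every paraphrase maps to the same answer set.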
Hi, how can I help you?: Automating enterprise IT support help desks
Question answering is one of the primary challenges of natural language
understanding. In realizing such a system, providing complex, long answers to
questions is more challenging than factoid answering, as the former requires
context disambiguation. The methods explored in the literature can be broadly
classified into three categories: 1) classification based, 2) knowledge graph
based, and 3) retrieval based. Individually, none of them addresses the need
for an enterprise-wide assistance system for the IT support and maintenance
domain. In this domain the variance of answers is large, ranging from factoid
answers to structured operating procedures; the knowledge is spread across
heterogeneous data sources like application-specific documentation and ticket
management systems; and no single technique for general-purpose assistance is
able to scale for such a landscape. To address this, we have
built a cognitive platform with capabilities adapted for this domain. Further,
we have built a general purpose question answering system leveraging the
platform that can be instantiated for multiple products and technologies in the
support domain. The system uses a novel hybrid answering model that
orchestrates across a deep learning classifier, a knowledge graph based context
disambiguation module and a sophisticated bag-of-words search system. This
orchestration performs context switching for a provided question and also does
a smooth hand-off of the question to a human expert if none of the automated
techniques can provide a confident answer. This system has been deployed across
675 internal enterprise IT support and maintenance projects.
Comment: To appear in IAAI 201
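The orchestration with human hand-off described above can be sketched as a confidence-gated cascade. This is a minimal illustration under assumed interfaces, not the deployed system: the answerer functions and the 0.7 threshold are placeholders.

```python
# Hedged sketch of a hybrid answering orchestration: try each automated
# technique (e.g. classifier, knowledge-graph lookup, bag-of-words
# search) in order, and hand off to a human expert if none of them
# produces a sufficiently confident answer.
def orchestrate(question, answerers, threshold=0.7):
    """answerers: ordered (name, fn) pairs; fn(question) -> (answer, confidence)."""
    best_confidence = 0.0
    for name, fn in answerers:
        answer, confidence = fn(question)
        if confidence >= threshold:
            return name, answer, confidence     # confident automated answer
        best_confidence = max(best_confidence, confidence)
    return "human_handoff", None, best_confidence  # no confident answer
```

Context switching in the real system would additionally route the question to the right product or technology instance before this cascade runs.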