627 research outputs found
Query Resolution for Conversational Search with Limited Supervision
In this work we focus on multi-turn passage retrieval as a crucial component
of conversational search. One of the key challenges in multi-turn passage
retrieval comes from the fact that the current turn query is often
underspecified due to zero anaphora, topic change, or topic return. Context
from the conversational history can be used to arrive at a better expression of
the current turn query, defined as the task of query resolution. In this paper,
we model the query resolution task as a binary term classification problem: for
each term appearing in the previous turns of the conversation decide whether to
add it to the current turn query or not. We propose QuReTeC (Query Resolution
by Term Classification), a neural query resolution model based on bidirectional
transformers. We propose a distant supervision method to automatically generate
training data by using query-passage relevance labels. Such labels are often
readily available in a collection either as human annotations or inferred from
user interactions. We show that QuReTeC outperforms state-of-the-art models,
and furthermore, that our distant supervision method can be used to
substantially reduce the amount of human-curated data required to train
QuReTeC. We incorporate QuReTeC in a multi-turn, multi-stage passage retrieval
architecture and demonstrate its effectiveness on the TREC CAsT dataset.Comment: SIGIR 2020 full conference pape
Follow-up question handling in the IMIX and Ritel systems: A comparative study
One of the basic topics of question answering (QA) dialogue systems is how follow-up questions should be interpreted by a QA system. In this paper, we shall discuss our experience with the IMIX and Ritel systems, for both of which a follow-up question handling scheme has been developed, and corpora have been collected. These two systems are each other's opposites in many respects: IMIX is multimodal, non-factoid, black-box QA, while Ritel is speech, factoid, keyword-based QA. Nevertheless, we will show that they are quite comparable, and that it is fruitful to examine the similarities and differences. We shall look at how the systems are composed, and how real, non-expert, users interact with the systems. We shall also provide comparisons with systems from the literature where possible, and indicate where open issues lie and in what areas existing systems may be improved. We conclude that most systems have a common architecture with a set of common subtasks, in particular detecting follow-up questions and finding referents for them. We characterise these tasks using the typical techniques used for performing them, and data from our corpora. We also identify a special type of follow-up question, the discourse question, which is asked when the user is trying to understand an answer, and propose some basic methods for handling it
Concept-based Interactive Query Expansion Support Tool (CIQUEST)
This report describes a three-year project (2000-03) undertaken in the Information Studies
Department at The University of Sheffield and funded by Resource, The Council for
Museums, Archives and Libraries. The overall aim of the research was to provide user
support for query formulation and reformulation in searching large-scale textual resources
including those of the World Wide Web. More specifically the objectives were: to investigate
and evaluate methods for the automatic generation and organisation of concepts derived from
retrieved document sets, based on statistical methods for term weighting; and to conduct
user-based evaluations on the understanding, presentation and retrieval effectiveness of
concept structures in selecting candidate terms for interactive query expansion.
The TREC test collection formed the basis for the seven evaluative experiments conducted in
the course of the project. These formed four distinct phases in the project plan. In the first
phase, a series of experiments was conducted to investigate further techniques for concept
derivation and hierarchical organisation and structure. The second phase was concerned with
user-based validation of the concept structures. Results of phases 1 and 2 informed on the
design of the test system and the user interface was developed in phase 3. The final phase
entailed a user-based summative evaluation of the CiQuest system.
The main findings demonstrate that concept hierarchies can effectively be generated from
sets of retrieved documents and displayed to searchers in a meaningful way. The approach
provides the searcher with an overview of the contents of the retrieved documents, which in
turn facilitates the viewing of documents and selection of the most relevant ones. Concept
hierarchies are a good source of terms for query expansion and can improve precision. The
extraction of descriptive phrases as an alternative source of terms was also effective. With
respect to presentation, cascading menus were easy to browse for selecting terms and for
viewing documents. In conclusion the project dissemination programme and future work are
outlined
Finding Structured and Unstructured Features to Improve the Search Result of Complex Question
-Recently, search engine got challenge deal with such a natural language questions.
Sometimes, these questions are complex questions. A complex question is a question that
consists several clauses, several intentions or need long answer.
In this work we proposed that finding structured features and unstructured features of
questions and using structured data and unstructured data could improve the search result
of complex questions. According to those, we will use two approaches, IR approach and
structured retrieval, QA template.
Our framework consists of three parts. Question analysis, Resource Discovery and
Analysis The Relevant Answer. In Question Analysis we used a few assumptions, and
tried to find structured and unstructured features of the questions. Structured feature
refers to Structured data and unstructured feature refers to unstructured data. In the
resource discovery we integrated structured data (relational database) and unstructured
data (webpage) to take the advantaged of two kinds of data to improve and reach the
relevant answer. We will find the best top fragments from context of the webpage In the
Relevant Answer part, we made a score matching between the result from structured data
and unstructured data, then finally used QA template to reformulate the question.
In the experiment result, it shows that using structured feature and unstructured
feature and using both structured and unstructured data, using approach IR and QA
template could improve the search result of complex questions
Designing Semantic Kernels as Implicit Superconcept Expansions
Recently, there has been an increased interest in the exploitation of background knowledge in the context of text mining tasks, especially text classification. At the same time, kernel-based learning algorithms like Support Vector Machines have become a dominant paradigm in the text mining community. Amongst other reasons, this is also due to their capability to achieve more accurate learning results by replacing standard linear kernel (bag-of-words) with customized kernel functions which incorporate additional apriori knowledge. In this paper we propose a new approach to the design of ‘semantic smoothing kernels’ by means of an implicit superconcept expansion using well-known measures of term similarity. The experimental evaluation on two different datasets indicates that our approach consistently improves performance in situations where (i) training data is scarce or (ii) the bag-ofwords representation is too sparse to build stable models when using the linear kernel
Spanish question answering evaluation
This paper reports the most significant issues related to the launching of a Monolingual Spanish Question Answering evaluation track at the Cross Language Evaluation Forum (CLEF 2003). It introduces some questions about multilingualism and describes the methodology for test suite production, task, judgment of answers as well as the results obtained by the participant systems
- …