A survey on the use of relevance feedback for information access systems
Users of online search engines often find it difficult to express their need for information in the form of a query. However, if the user can identify examples of the kind of documents they require, then they can employ a technique known as relevance feedback. Relevance feedback covers a range of techniques intended to improve a user's query and facilitate retrieval of information relevant to a user's information need. In this paper we survey relevance feedback techniques. We study both automatic techniques, in which the system modifies the user's query, and interactive techniques, in which the user has control over query modification. We also consider specific interfaces to relevance feedback systems and characteristics of searchers that can affect the use and success of relevance feedback systems.
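As an illustration of automatic query modification, the following is a minimal sketch of Rocchio-style relevance feedback, one classic technique in this family. The abstract does not name a specific method, so the choice of Rocchio, the function name `rocchio`, the term-to-weight dictionary representation, and the default values of `alpha`, `beta`, and `gamma` are all assumptions made for illustration.

```python
from collections import defaultdict


def rocchio(query, relevant, nonrelevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Return a modified query vector (term -> weight).

    query: dict mapping terms to weights.
    relevant / nonrelevant: lists of document vectors (dicts) that the
    user judged relevant or non-relevant.
    The default coefficients are illustrative, not prescribed values.
    """
    new_query = defaultdict(float)

    # Keep the original query, scaled by alpha.
    for term, weight in query.items():
        new_query[term] += alpha * weight

    # Move toward the centroid of relevant documents and away from
    # the centroid of non-relevant documents.
    for docs, coeff in ((relevant, beta), (nonrelevant, -gamma)):
        if not docs:
            continue
        for doc in docs:
            for term, weight in doc.items():
                new_query[term] += coeff * weight / len(docs)

    # Drop terms whose final weight is not positive.
    return {t: w for t, w in new_query.items() if w > 0}


if __name__ == "__main__":
    expanded = rocchio(
        {"jaguar": 1.0},
        relevant=[{"jaguar": 1.0, "cat": 1.0}],
        nonrelevant=[{"jaguar": 1.0, "car": 1.0}],
    )
    print(expanded)
```

In an interactive variant, the system would instead show the candidate terms to the searcher and add only those the searcher selects.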
Concept-based Interactive Query Expansion Support Tool (CIQUEST)
This report describes a three-year project (2000-03) undertaken in the Information Studies
Department at The University of Sheffield and funded by Resource, The Council for
Museums, Archives and Libraries. The overall aim of the research was to provide user
support for query formulation and reformulation in searching large-scale textual resources
including those of the World Wide Web. More specifically the objectives were: to investigate
and evaluate methods for the automatic generation and organisation of concepts derived from
retrieved document sets, based on statistical methods for term weighting; and to conduct
user-based evaluations on the understanding, presentation and retrieval effectiveness of
concept structures in selecting candidate terms for interactive query expansion.
The TREC test collection formed the basis for the seven evaluative experiments conducted in
the course of the project. These formed four distinct phases in the project plan. In the first
phase, a series of experiments was conducted to investigate further techniques for concept
derivation and hierarchical organisation and structure. The second phase was concerned with
user-based validation of the concept structures. Results of phases 1 and 2 informed the
design of the test system, and the user interface was developed in phase 3. The final phase
entailed a user-based summative evaluation of the CiQuest system.
The main findings demonstrate that concept hierarchies can effectively be generated from
sets of retrieved documents and displayed to searchers in a meaningful way. The approach
provides the searcher with an overview of the contents of the retrieved documents, which in
turn facilitates the viewing of documents and selection of the most relevant ones. Concept
hierarchies are a good source of terms for query expansion and can improve precision. The
extraction of descriptive phrases as an alternative source of terms was also effective. With
respect to presentation, cascading menus were easy to browse for selecting terms and for
viewing documents. In conclusion, the project dissemination programme and future work are
outlined.
Query exhaustivity, relevance feedback and search success in automatic and interactive query expansion
This study explored how the expression of search facets and relevance feedback by users was related to search success in interactive and automatic query expansion in the course of the search process. Search success was measured both by the number of relevant documents retrieved and by the relevance scores of these items on a four-point scale. The research design consisted of 26 users searching for four TREC topics in the Okapi IR system, half using interactive and half using automatic query expansion based on relevance feedback (RF). The search logs were recorded, and the users filled in a questionnaire for each topic concerning various features of searching. The results showed that the exhaustivity of the query was the most significant predictor of search success, and that interactive expansion led to better search success than automatic expansion.
Technology-Assisted Review in E-Discovery Can Be More Effective and More Efficient Than Exhaustive Manual Review
E-discovery processes that use automated tools to prioritize and select documents for review are typically regarded as potential cost-savers – but inferior alternatives – to exhaustive manual review, in which a cadre of reviewers assesses every document for responsiveness to a production request, and for privilege. This Article offers evidence that such technology-assisted processes, while indeed more efficient, can also yield results superior to those of exhaustive manual review, as measured by recall and precision, as well as F1, a summary measure combining both recall and precision. The evidence derives from an analysis of data collected from the TREC 2009 Legal Track Interactive Task, and shows that, at TREC 2009, technology-assisted review processes enabled two participating teams to achieve results superior to those that could have been achieved through a manual review of the entire document collection by the official TREC assessors.
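For readers unfamiliar with the measures named above, the sketch below computes precision, recall, and F1 (their harmonic mean) from counts of review decisions. The function name `precision_recall_f1` and the example counts are hypothetical and are not taken from the TREC 2009 data.

```python
def precision_recall_f1(true_pos, false_pos, false_neg):
    """Compute precision, recall, and F1 from raw counts.

    true_pos:  responsive documents that the review marked responsive
    false_pos: non-responsive documents that the review marked responsive
    false_neg: responsive documents that the review missed
    """
    retrieved = true_pos + false_pos
    responsive = true_pos + false_neg

    precision = true_pos / retrieved if retrieved else 0.0
    recall = true_pos / responsive if responsive else 0.0

    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    p, r, f1 = precision_recall_f1(true_pos=80, false_pos=20, false_neg=20)
    print(f"precision={p:.2f} recall={r:.2f} F1={f1:.2f}")
```

Because F1 is a harmonic mean, a review process scores well only when precision and recall are both high.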
Building on Redundancy: Factoid Question Answering, Robust Retrieval and the "Other"
We have explored how redundancy-based techniques can be used to improve factoid question answering, definitional
questions (“other”), and robust retrieval. For the factoids, we explored the meta approach: we submitted the questions to
several open-domain question answering systems available on the Web and applied our redundancy-based triangulation
algorithm to analyze their outputs in order to identify the most promising answers. Our results support the added value of the
meta approach: the performance of the combined system surpassed the underlying performances of its components. To
answer definitional (“other”) questions, we looked for sentences containing re-occurring pairs of noun entities
containing the elements of the target. For robust retrieval, we applied our redundancy-based Internet mining technique to
identify the concepts (single word terms or phrases) that were highly related to the topic (query) and expanded the queries
with them. All our results are above the mean performance in the categories in which we have participated, with one of our
robust runs being the best in its category among all 24 participants. Overall, our findings support the hypothesis that using as
much textual data as possible, especially data mined from the World Wide Web, is extremely promising.
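The general idea behind the meta approach, preferring answers that several independent systems agree on, can be sketched with simple vote counting. This is not the authors' triangulation algorithm, whose details the abstract does not give; the function name `rank_by_redundancy`, the lowercase normalization, and the sample answers are assumptions for illustration.

```python
from collections import Counter


def rank_by_redundancy(answers_by_system):
    """Rank candidate answers by how many systems returned them.

    answers_by_system: one list of answer strings per QA system.
    Each system contributes at most one vote per normalized answer.
    """
    votes = Counter()
    for answers in answers_by_system:
        # Normalize whitespace and case so trivial variants merge,
        # and use a set so a system cannot vote twice for one answer.
        votes.update({a.strip().lower() for a in answers})
    return votes.most_common()


if __name__ == "__main__":
    # Hypothetical outputs from three QA systems for one question.
    ranked = rank_by_redundancy([
        ["Paris", "Lyon"],
        ["paris"],
        ["Paris ", "Marseille"],
    ])
    print(ranked[0])
```

A fuller system would also need to reconcile answers that differ in surface form but refer to the same entity, which this sketch does not attempt.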