Local and global query expansion for hierarchical complex topics
In this work we study local and global methods for query expansion for multifaceted complex topics. We study word-based and entity-based expansion methods and extend these approaches to complex topics using fine-grained expansion on different elements of the hierarchical query structure. For a source of hierarchical complex topics we use the TREC Complex Answer Retrieval (CAR) benchmark data collection. We find that leveraging the hierarchical topic structure is needed for both local and global expansion methods to be effective. Further, the results demonstrate that entity-based expansion methods show significant gains over word-based models alone, with local feedback providing the largest improvement. The results on the CAR paragraph retrieval task demonstrate that expansion models that incorporate both the hierarchical query structure and entity-based expansion result in a greater than 20% improvement over word-based expansion approaches
Exploring sentence level query expansion in language modeling based information retrieval
We introduce two novel methods for query expansion in information retrieval (IR). The basis of these methods is to add the most similar sentences extracted from pseudo-relevant documents to the original query. The first method adds a fixed number of sentences to the original query, the second a progressively decreasing number of sentences. We evaluate these methods on the English and Bengali test collections from the FIRE workshops. The major findings of this study are that: i) performance is similar for both English and Bengali; ii) employing a smaller context (similar sentences) yields a considerably higher mean average precision (MAP) compared to extracting terms from full documents (up to 5.9% improvement in MAP for English and 10.7% for Bengali compared to standard Blind Relevance Feedback (BRF)); iii) using a variable number of sentences for query expansion performs better and shows less variance in the best MAP for different parameter settings; iv) query expansion based on sentences can improve performance even for topics with low initial retrieval precision where standard BRF fails
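The two sentence-selection strategies described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the cosine similarity over raw term counts, the regex sentence splitter, the linear "one fewer sentence per rank" schedule, and all function names are assumptions made for illustration only.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def split_sentences(doc):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", doc) if s.strip()]


def top_sentences(query, doc, k):
    """Return the k sentences of doc most similar to the query."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(s))), s) for s in split_sentences(doc)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [s for _, s in scored[:k]]


def expand_fixed(query, ranked_docs, k):
    """Take the same number of sentences from every pseudo-relevant document."""
    expansion = []
    for doc in ranked_docs:
        expansion.extend(top_sentences(query, doc, k))
    return expansion


def expand_decreasing(query, ranked_docs, k_max):
    """Take fewer sentences from lower-ranked documents.

    Assumed schedule: k_max sentences at rank 0, one fewer per rank.
    """
    expansion = []
    for rank, doc in enumerate(ranked_docs):
        expansion.extend(top_sentences(query, doc, max(k_max - rank, 0)))
    return expansion


ranked_docs = [
    "Query expansion adds terms. Cats sleep a lot. "
    "Expansion of the query helps retrieval.",
    "Dogs bark. Query terms matter.",
]
fixed = expand_fixed("query expansion", ranked_docs, k=1)
decreasing = expand_decreasing("query expansion", ranked_docs, k_max=2)
```

In this sketch the selected sentences would be appended to the original query before a second retrieval pass. The decreasing variant draws more expansion text from the top-ranked documents, which are the ones most likely to be relevant.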
Combining Language Models with NLP and Interactive Query Expansion.
Following our previous participation in the INEX 2008 Ad-hoc track, we continue to address both standard and focused retrieval tasks based on comprehensible language models and interactive query expansion (IQE). Query topics are expanded using an initial set of Multiword Terms (MWTs) selected from the top n ranked documents. In this experiment, we extract MWTs from article titles, the narrative field and automatically generated summaries. We combine the initial set of MWTs obtained in an IQE process with automatic query expansion (AQE) using language models and a smoothing mechanism. We chose as baseline the Indri IR engine based on the language model using Dirichlet smoothing. We also compare the performance of bag-of-words approaches (TFIDF and BM25) to search strategies built on language models and query expansion (QE). The experiment is carried out on all INEX 2009 Ad-hoc tasks
Query exhaustivity, relevance feedback and search success in automatic and interactive query expansion
This study explored how users' expression of search facets and relevance feedback (RF) was related to search success in interactive and automatic query expansion over the course of the search process. Search success was measured both by the number of relevant documents retrieved and by the relevance scores of these items on a four-point scale. The research design consisted of 26 users searching for four TREC topics in the Okapi IR system, half using interactive and half using automatic query expansion based on RF. The search logs were recorded, and the users filled in a questionnaire for each topic concerning various features of searching. The results showed that the exhaustivity of the query was the most significant predictor of search success, and that interactive expansion led to better search success than automatic expansion
Comparative evaluation of query expansion methods for enhanced search on microblog data: DCU ADAPT @ SMERP 2017 workshop data challenge
The rapid growth in the availability of social media content posted during emergency situations is creating significant interest in research into how this information can be exploited to assist emergency relief operations and to help with emergency preparedness and in early warning systems. We describe the DCU ADAPT Centre participation in the microblog search data challenge at the SMERP 2017 workshop. This task aimed to promote development of information retrieval (IR) methods for practical challenges that need to be addressed during an emergency event, along with comparative evaluation of the methodologies developed for this task. The task is based on a large dataset of microblogs posted during the earthquake in Italy in August 2016, together with a set of query topics provided by the task organisers. For our participation in this task we explored use of three different IR techniques: standard IR query expansion based on an external resource, query expansion based on WordNet and use of query expansio
GRAPHENE: A Precise Biomedical Literature Retrieval Engine with Graph Augmented Deep Learning and External Knowledge Empowerment
Effective biomedical literature retrieval (BLR) plays a central role in
precision medicine informatics. In this paper, we propose GRAPHENE, which is a
deep learning based framework for precise BLR. GRAPHENE consists of three main
modules: 1) graph-augmented document representation learning; 2) query
expansion and representation learning; and 3) learning to rank biomedical
articles. The graph-augmented document representation learning module
constructs a document-concept graph containing biomedical concept nodes and
document nodes so that global biomedical concepts from an external
knowledge source can be captured, which is further connected to a BiLSTM so
both local and global topics can be explored. The query expansion and
representation learning module expands the query with abbreviations and
different names, and then builds a CNN-based model to convolve the expanded
query and obtain a vector representation for each query. The learning-to-rank
module minimizes a ranking loss between biomedical articles and the query to learn
the retrieval function. Experimental results on applying our system to TREC
Precision Medicine track data are provided to demonstrate its effectiveness.
Comment: CIKM 201
Integrating multiple windows and document features for expert finding
Expert finding is a key task in enterprise search and has recently attracted lots of attention from both research and industry communities. Given a search topic, a prominent existing approach is to apply some information retrieval (IR) system to retrieve top ranking documents, which will then be used to derive associations between experts and the search topic based on cooccurrences. However, we argue that expert finding is more sensitive to multiple levels of associations and document features that current expert finding systems insufficiently address, including (a) multiple levels of associations between experts and search topics, (b) document internal structure, and (c) document authority. We propose a novel approach that integrates the above-mentioned three aspects as well as a query expansion technique in a two-stage model for expert finding. A systematic evaluation is conducted on TREC collections to test the performance of our approach as well as the effects of multiple windows, document features, and query expansion. These experimental results show that query expansion can dramatically improve expert finding performance with statistical significance. For three well-known IR models with or without query expansion, document internal structures help improve a single window-based approach but without statistical significance, while our novel multiple window-based approach can significantly improve the performance of a single window-based approach both with and without document internal structures
Characterizing Question Facets for Complex Answer Retrieval
Complex answer retrieval (CAR) is the process of retrieving answers to
questions that have multifaceted or nuanced answers. In this work, we present
two novel approaches for CAR based on the observation that question facets can
vary in utility: from structural (facets that can apply to many similar topics,
such as 'History') to topical (facets that are specific to the question's
topic, such as the 'Westward expansion' of the United States). We first explore
a way to incorporate facet utility into ranking models during query term score
combination. We then explore a general approach to reform the structure of
ranking models to aid in learning of facet utility in the query-document term
matching phase. When we use our techniques with a leading neural ranker on the
TREC CAR dataset, our methods rank first in the 2017 TREC CAR benchmark, and
yield up to 26% higher performance than the next best method.
Comment: 4 pages; SIGIR 2018 Short Pape
Search Patterns Through a Health-Information Site: Considering the Need for Complex Subject Indexing
This study considers the impact of taxonomy development on user query-expansion patterns at NC Health Info, a Web database of North Carolina online health and medical resources. In consideration of simplifying NC Health Info's taxonomy, user session logs were analyzed for selection frequency of general and specific topics and directional patterns between general and specific topics as initial and subsequent selectors. Based on a sampling of session logs over a seven-month period, users exhibited no clear preference for general or specific topics. In an analysis of topics deemed crucial to North Carolinians by a governor's task force, patterns illustrated a significant preference for specific topics over general topics. This research, and the results of previous studies regarding taxonomy development and query-expansion, suggests that a simple taxonomy would less effectively serve users
Using WordNet for query expansion: ADAPT @ FIRE 2016 microblog track
User-generated content on social websites such as Twitter is known to be an important source of real-time information on significant events as they occur, for example natural disasters. Our participation in the FIRE 2016 Microblog track seeks to exploit WordNet as an external resource for synonym-based query expansion to support improved matching between search topics and the target Tweet collection. The results of our participation in this task show that this is an effective method for use with a standard BM25-based information retrieval system for this task
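As a rough illustration of synonym-based query expansion feeding a BM25 ranker, the sketch below uses a tiny hand-written synonym table as a stand-in for WordNet and a textbook Okapi BM25 formula with a non-negative IDF variant. The table contents, the parameter defaults (k1 = 1.2, b = 0.75), the example tweets, and the function names are all assumptions, not details taken from the paper.

```python
import math
from collections import Counter

# Hypothetical synonym table standing in for a WordNet lookup.
SYNONYMS = {
    "earthquake": ["quake", "tremor"],
    "help": ["aid", "assistance"],
}


def expand_query(terms, synonyms=SYNONYMS):
    """Append synonyms after each original term, skipping duplicates."""
    expanded = []
    for term in terms:
        for candidate in [term] + synonyms.get(term, []):
            if candidate not in expanded:
                expanded.append(candidate)
    return expanded


def bm25_scores(query_terms, docs, k1=1.2, b=0.75):
    """Score each tokenized document against the query with Okapi BM25."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(t for d in docs for t in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for t in query_terms:
            if tf[t] == 0:
                continue
            idf = math.log((n_docs - df[t] + 0.5) / (df[t] + 0.5) + 1.0)
            length_norm = k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[t] * (k1 + 1) / (tf[t] + length_norm)
        scores.append(score)
    return scores


tweets = [
    ["quake", "hits", "central", "italy"],
    ["need", "aid", "in", "amatrice"],
    ["football", "match", "tonight"],
]
query = ["earthquake", "help"]
baseline = bm25_scores(query, tweets)
expanded = bm25_scores(expand_query(query), tweets)
```

With the unexpanded query none of the example tweets share a term with the topic, so every score is zero. After expansion the two disaster-related tweets match on "quake" and "aid", which is the vocabulary-mismatch effect that synonym expansion is meant to address in short microblog texts.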