859 research outputs found
Investigating Retrieval Method Selection with Axiomatic Features
We consider algorithm selection in the context of ad-hoc information retrieval. Given a query and a pair of retrieval methods, we propose a meta-learner that predicts how to combine the methods' relevance scores into an overall relevance score. Inspired by neural models' different properties with regard to IR axioms, these predictions are based on features that quantify axiom-related properties of the query and its top-ranked documents. We conduct an evaluation on TREC Web Track data and find that the meta-learner often significantly improves over the individual methods. Finally, we conduct feature and query weight analyses to investigate the meta-learner's behavior.
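As a rough illustration of what combining two methods' relevance scores can look like, the sketch below uses a simple linear interpolation. This is an assumption: the abstract does not state the combination form, and the names `combine_scores`, `rank`, and the per-query `weight` (standing in for a meta-learner's prediction) are hypothetical.

```python
def combine_scores(scores_a, scores_b, weight):
    """Linearly interpolate two methods' per-document relevance scores.

    `weight` is a hypothetical per-query value in [0, 1], standing in for
    whatever a meta-learner would predict; it is not taken from the paper.
    Documents missing from one method are given a score of 0.0 there.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    docs = set(scores_a) | set(scores_b)
    return {
        d: weight * scores_a.get(d, 0.0) + (1.0 - weight) * scores_b.get(d, 0.0)
        for d in docs
    }


def rank(scores):
    """Return document ids sorted by descending score (ties broken by id)."""
    return sorted(scores, key=lambda d: (-scores[d], d))


if __name__ == "__main__":
    method_a = {"d1": 1.0, "d2": 0.0}
    method_b = {"d1": 0.0, "d2": 1.0}
    # A weight close to 1 trusts method A, a weight close to 0 trusts method B.
    print(rank(combine_scores(method_a, method_b, 0.75)))
    print(rank(combine_scores(method_a, method_b, 0.25)))
```

In practice the two score distributions would likely need normalizing before interpolation; that step is left out here to keep the sketch short.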
Explainable Information Retrieval: A Survey
Explainable information retrieval is an emerging research area aiming to make
information retrieval systems transparent and trustworthy. Given the increasing
use of complex machine learning models in search systems, explainability is
essential in building and auditing responsible information retrieval models.
This survey fills a vital gap in the otherwise topically diverse literature of
explainable information retrieval. It categorizes and discusses recent
explainability methods developed for different application domains in
information retrieval, providing a common framework and unifying perspectives.
In addition, it reflects on the common concern of evaluating explanations and
highlights open challenges and opportunities. Comment: 35 pages, 10 figures. Under review
How Different are Pre-trained Transformers for Text Ranking?
In recent years, large pre-trained transformers have led to substantial gains
in performance over traditional retrieval models and feedback approaches.
However, these results are primarily based on the MS Marco/TREC Deep Learning
Track, with its very particular setup, and our understanding of why and
how these models work better is fragmented at best. We analyze effective
BERT-based cross-encoders versus traditional BM25 ranking for the passage
retrieval task where the largest gains have been observed, and investigate two
main questions. On the one hand, what is similar? To what extent does the
neural ranker already encompass the capacity of traditional rankers? Is the
gain in performance due to a better ranking of the same documents (prioritizing
precision)? On the other hand, what is different? Can it retrieve effectively
documents missed by traditional systems (prioritizing recall)? We discover
substantial differences in the notion of relevance identifying strengths and
weaknesses of BERT that may inspire research for future improvement. Our
results contribute to our understanding of (black-box) neural rankers relative
to (well-understood) traditional rankers, and help understand the particular
experimental setting of MS-Marco-based test collections. Comment: ECIR 202
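The "what is similar, what is different" questions can be pictured as a comparison of two rankers' top-k lists. The sketch below is only a minimal illustration under that assumption; `compare_top_k` and the toy rankings are hypothetical and do not reproduce the paper's analysis.

```python
def compare_top_k(ranking_a, ranking_b, k):
    """Compare the top-k documents of two rankings.

    Returns the documents both rankers place in their top k, the documents
    only ranker B retrieves there, and the fraction of the top k they share.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    top_a, top_b = ranking_a[:k], ranking_b[:k]
    set_a, set_b = set(top_a), set(top_b)
    shared = [d for d in top_a if d in set_b]      # same documents, maybe reordered
    only_b = [d for d in top_b if d not in set_a]  # documents ranker A missed
    return {"shared": shared, "only_b": only_b, "overlap": len(shared) / k}


if __name__ == "__main__":
    # Toy rankings; the document ids are made up for illustration.
    traditional = ["d1", "d2", "d3", "d4"]
    neural = ["d3", "d1", "d5", "d2"]
    print(compare_top_k(traditional, neural, k=3))
```

A high overlap with different ordering would point to re-ranking of the same documents, while a large `only_b` list would point to documents the first ranker did not surface.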
A study on the Interpretability of Neural Retrieval Models using DeepSHAP
A recent trend in IR has been the usage of neural networks to learn retrieval
models for text-based ad-hoc search. While various approaches and architectures
have yielded significantly better performance than traditional retrieval models
such as BM25, it is still difficult to understand exactly why a document is
relevant to a query. In the ML community, several approaches for explaining
decisions made by deep neural networks have been proposed -- including DeepSHAP,
which modifies the DeepLIFT algorithm to estimate the relative importance
(Shapley values) of input features for a given decision by comparing the
activations in the network for a given image against the activations caused by
a reference input. In image classification, the reference input tends to be a
plain black image. While DeepSHAP has been well studied for image
classification tasks, it remains to be seen how we can adapt it to explain the
output of Neural Retrieval Models (NRMs). In particular, what is a good "black"
image in the context of IR? In this paper we explored various reference input
document construction techniques. Additionally, we compared the explanations
generated by DeepSHAP to LIME (a model agnostic approach) and found that the
explanations differ considerably. Our study raises concerns regarding the
robustness and accuracy of explanations produced for NRMs. With this paper we
aim to shed light on interesting problems surrounding interpretability in NRMs
and highlight areas of future work. Comment: 4 pages; SIGIR 2019 Short Paper
ABNIRML: Analyzing the Behavior of Neural IR Models
Numerous studies have demonstrated the effectiveness of pretrained
contextualized language models such as BERT and T5 for ad-hoc search. However,
it is not well-understood why these methods are so effective, what makes some
variants more effective than others, and what pitfalls they may have. We
present a new comprehensive framework for Analyzing the Behavior of Neural IR
ModeLs (ABNIRML), which includes new types of diagnostic tests that allow us to
probe several characteristics---such as sensitivity to word order---that are
not addressed by previous techniques. To demonstrate the value of the
framework, we conduct an extensive empirical study that yields insights into
the factors that contribute to the neural model's gains, and identify potential
unintended biases the models exhibit. We find evidence that recent neural
ranking models have fundamentally different characteristics from prior ranking
models. For instance, these models can be highly influenced by altered document
word order, sentence order and inflectional endings. They can also exhibit
unexpected behaviors when additional content is added to documents, or when
documents are expressed with different levels of fluency or formality. We find
that these differences can depend on the architecture and not just the
underlying language model.
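A diagnostic test for sensitivity to word order can be sketched as: score a document, perturb its word order, score it again, and look at the difference. Everything below is a hypothetical stand-in: the two toy scorers replace real ranking models, and reversing the word order is just one deterministic choice of perturbation, not the framework's own procedure.

```python
def reverse_words(text):
    """Deterministic word-order perturbation: reverse the token sequence."""
    return " ".join(reversed(text.split()))


def unigram_score(query, doc):
    """Toy order-insensitive scorer: number of query terms found in the document."""
    doc_terms = set(doc.lower().split())
    return float(sum(1 for t in query.lower().split() if t in doc_terms))


def bigram_score(query, doc):
    """Toy order-sensitive scorer: number of query bigrams found in the document."""
    def bigrams(text):
        toks = text.lower().split()
        return set(zip(toks, toks[1:]))
    return float(len(bigrams(query) & bigrams(doc)))


def order_sensitivity(scorer, query, doc, perturb=reverse_words):
    """Score change caused by perturbing the document's word order."""
    return scorer(query, doc) - scorer(query, perturb(doc))


if __name__ == "__main__":
    query = "neural ranking"
    doc = "neural ranking models are effective"
    print(order_sensitivity(unigram_score, query, doc))  # bag-of-words scorer
    print(order_sensitivity(bigram_score, query, doc))   # bigram scorer
```

A scorer whose output does not change under the perturbation is insensitive to word order; a nonzero difference indicates that the scorer relies on term order.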
CBR and MBR techniques: review for an application in the emergencies domain
The purpose of this document is to provide an in-depth analysis of current reasoning engine practice and the integration strategies of Case Based Reasoning and Model Based Reasoning that will be used in the design and development of the RIMSAT system.
RIMSAT (Remote Intelligent Management Support and Training) is a European Commission funded project designed to:
a. Provide an innovative, 'intelligent', knowledge based solution aimed at improving the quality of critical decisions.
b. Enhance the competencies and responsiveness of individuals and organisations involved in highly complex, safety critical incidents - irrespective of their location.
In other words, RIMSAT aims to design and implement a decision support system that applies Case Based Reasoning and Model Based Reasoning technology to the management of emergency situations.
This document is part of a deliverable for the RIMSAT project, and although it was written in close contact with the requirements of the project, it provides an overview broad enough to serve as a state of the art in integration strategies between CBR and MBR technologies. Postprint (published version)