Learning a Deep Listwise Context Model for Ranking Refinement
Learning to rank has been intensively studied and widely applied in
information retrieval. Typically, a global ranking function is learned from a
set of labeled data, which can achieve good performance on average but may be
suboptimal for individual queries by ignoring the fact that relevant documents
for different queries may have different distributions in the feature space.
Inspired by the idea of pseudo relevance feedback where top ranked documents,
which we refer to as the local ranking context, can provide important
information about the query's characteristics, we propose to use the inherent
feature distributions of the top results to learn a Deep Listwise Context Model
that helps us fine-tune the initial ranked list. Specifically, we employ a
recurrent neural network to sequentially encode the top results using their
feature vectors, learn a local context model and use it to re-rank the top
results. The model has three merits: (1) Our model can capture the
local ranking context based on the complex interactions between top results
using a deep neural network; (2) Our model can be built upon existing
learning-to-rank methods by directly using their extracted feature vectors; (3)
Our model is trained with an attention-based loss function, which is more
effective and efficient than many existing listwise methods. Experimental
results show that the proposed model can significantly improve the
state-of-the-art learning to rank methods on benchmark retrieval corpora.
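The attention-based listwise loss mentioned in merit (3) can be illustrated with a minimal sketch. It assumes that both the relevance labels and the model scores of one result list are turned into attention distributions with a plain softmax and compared by cross entropy; the function names `softmax` and `attention_listwise_loss` are hypothetical, and the formulation in the paper may differ in detail.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def attention_listwise_loss(scores, labels):
    """Cross entropy between label-derived and score-derived attention.

    Assumption: both distributions are plain softmaxes over one ranked
    list; this is an illustration, not the paper's exact loss.
    """
    target = softmax([float(y) for y in labels])
    predicted = softmax(scores)
    return -sum(t * math.log(p) for t, p in zip(target, predicted))


# Graded relevance labels for three top results of one query.
labels = [2, 0, 1]
# Scores that agree with the labels should yield a lower loss
# than scores that promote the irrelevant document.
good = attention_listwise_loss([3.0, 0.5, 1.5], labels)
bad = attention_listwise_loss([0.5, 3.0, 1.5], labels)
print(good < bad)
```

Because the loss is computed over the whole list at once, a single pass produces the gradient signal for every document in the list, which is one plausible reading of the efficiency claim above.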
Unconfounded Propensity Estimation for Unbiased Ranking
The goal of unbiased learning to rank (ULTR) is to leverage implicit user
feedback for optimizing learning-to-rank systems. Among existing solutions,
automatic ULTR algorithms that jointly learn user bias models (i.e., propensity
models) with unbiased rankers have received a lot of attention due to their
superior performance and low deployment cost in practice. Despite their
theoretical soundness, the effectiveness is usually justified under a weak
logging policy, where the ranking model can barely rank documents according to
their relevance to the query. However, when the logging policy is strong, e.g.,
an industry-deployed ranking policy, the reported effectiveness cannot be
reproduced. In this paper, we first investigate ULTR from a causal perspective
and uncover a negative result: existing ULTR algorithms fail to address the
issue of propensity overestimation caused by the query-document relevance
confounder. Then, we propose a new learning objective based on backdoor
adjustment and highlight its differences from conventional propensity models,
which reveal the prevalence of propensity overestimation. On top of that, we
introduce a novel propensity model called Logging-Policy-aware Propensity (LPP)
model and its distinctive two-step optimization strategy, which allows for the
joint learning of LPP and ranking models within the automatic ULTR framework,
and actualizes unconfounded propensity estimation for ULTR. Extensive
experiments on two benchmarks demonstrate the effectiveness and
generalizability of the proposed method.
Comment: 11 pages, 5 figures
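The propensity overestimation described in this abstract can be made concrete with a toy backdoor adjustment. The sketch below is a generic illustration, not the LPP model: it assumes a hypothetical click log of `(position, relevance, click)` rows in which relevance is directly observed (which real click logs do not provide), and a strong logging policy that places most relevant documents at position 1. All counts are invented for the example.

```python
from collections import defaultdict


def naive_click_rate(log, position):
    """P(click | position), ignoring the relevance confounder."""
    clicks = [c for pos, _, c in log if pos == position]
    return sum(clicks) / len(clicks)


def adjusted_click_rate(log, position):
    """Backdoor adjustment: sum over r of P(click | position, r) * P(r)."""
    rel_counts = defaultdict(int)
    for _, rel, _ in log:
        rel_counts[rel] += 1
    estimate = 0.0
    for rel, count in rel_counts.items():
        clicks = [c for pos, r, c in log if pos == position and r == rel]
        if not clicks:
            raise ValueError("relevance stratum unobserved at this position")
        estimate += (sum(clicks) / len(clicks)) * (count / len(log))
    return estimate


def rows(position, relevance, n, clicks):
    """Build n toy log rows, the first `clicks` of which are clicked."""
    return [(position, relevance, 1 if i < clicks else 0) for i in range(n)]


# Toy data: examination is twice as likely at position 1 as at position 2,
# and the logging policy puts 80% relevant documents at position 1.
log = (rows(1, 1, 80, 64) + rows(1, 0, 20, 4)
       + rows(2, 1, 20, 8) + rows(2, 0, 80, 8))

naive_ratio = naive_click_rate(log, 1) / naive_click_rate(log, 2)
adjusted_ratio = adjusted_click_rate(log, 1) / adjusted_click_rate(log, 2)
print(naive_ratio, adjusted_ratio)
```

In this toy log the naive ratio between the two positions is far larger than the ratio used to generate the data, because position 1 is credited for clicks that are really due to relevance; stratifying on relevance recovers the generating ratio.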
Generalized Weak Supervision for Neural Information Retrieval
Neural ranking models (NRMs) have demonstrated effective performance in
several information retrieval (IR) tasks. However, training NRMs often requires
large-scale training data, which is difficult and expensive to obtain. To
address this issue, one can train NRMs via weak supervision, where a large
dataset is automatically generated using an existing ranking model (called the
weak labeler) for training NRMs. Weakly supervised NRMs can generalize from the
observed data and significantly outperform the weak labeler. This paper
generalizes this idea through an iterative re-labeling process, demonstrating
that weakly supervised models can iteratively play the role of weak labeler and
significantly improve ranking performance without using manually labeled data.
The proposed Generalized Weak Supervision (GWS) solution is generic and
orthogonal to the ranking model architecture. This paper offers four
implementations of GWS: self-labeling, cross-labeling, joint cross- and
self-labeling, and greedy multi-labeling. GWS also benefits from a query
importance weighting mechanism based on query performance prediction methods to
reduce noise in the generated training data. We further draw a theoretical
connection between self-labeling and Expectation-Maximization. Our experiments
on two passage retrieval benchmarks suggest that all implementations of GWS
lead to substantial improvements compared to weak supervision in all cases.
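The self-labeling variant of the iterative re-labeling process can be sketched as a simple loop in which each trained model becomes the labeler for the next round. Everything named here (`self_labeling`, `overlap_labeler`, `toy_train`) is hypothetical: the weak labeler is a term-overlap score standing in for an existing ranking model, and the "training" step merely memorizes its labels, so the sketch shows the control flow only and makes no claim about ranking quality.

```python
from typing import Callable, List, Sequence, Tuple

Scorer = Callable[[str, str], float]
Example = Tuple[str, str, float]  # (query, document, weak label)


def self_labeling(initial_labeler: Scorer,
                  train: Callable[[List[Example]], Scorer],
                  pairs: Sequence[Tuple[str, str]],
                  rounds: int) -> List[Scorer]:
    """Iteratively relabel `pairs`, using each new model as the next labeler."""
    labeler = initial_labeler
    models = []
    for _ in range(rounds):
        # Label the unlabeled pairs with the current labeler.
        data = [(q, d, labeler(q, d)) for q, d in pairs]
        # Train a new model on those labels; it labels the next round.
        labeler = train(data)
        models.append(labeler)
    return models


def overlap_labeler(query: str, doc: str) -> float:
    """Toy weak labeler: fraction of query terms that occur in the document."""
    q_terms = set(query.split())
    return len(q_terms & set(doc.split())) / len(q_terms)


def toy_train(data: List[Example]) -> Scorer:
    """Placeholder for model training: memorizes the labels it was given."""
    table = {(q, d): label for q, d, label in data}
    return lambda q, d: table.get((q, d), 0.0)


pairs = [("deep ranking", "deep listwise ranking model"),
         ("deep ranking", "weak supervision")]
models = self_labeling(overlap_labeler, toy_train, pairs, rounds=3)
final_scores = [models[-1](q, d) for q, d in pairs]
print(final_scores)
```

Passing the labeler and the training routine as callables keeps the loop independent of the ranking model architecture, in line with the abstract's statement that GWS is generic in that respect.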
- …