Controlling Fairness and Bias in Dynamic Learning-to-Rank
Rankings are the primary interface through which many online platforms match
users to items (e.g. news, products, music, video). In these two-sided markets,
not only do the users draw utility from the rankings, but the rankings also
determine the utility (e.g. exposure, revenue) for the item providers (e.g.
publishers, sellers, artists, studios). It has already been noted that
myopically optimizing utility to the users, as done by virtually all
learning-to-rank algorithms, can be unfair to the item providers. We therefore
present a learning-to-rank approach for explicitly enforcing
merit-based fairness guarantees to groups of items (e.g. articles by the same
publisher, tracks by the same artist). In particular, we propose a learning
algorithm that ensures notions of amortized group fairness, while
simultaneously learning the ranking function from implicit feedback data. The
algorithm takes the form of a controller that integrates unbiased estimators
for both fairness and utility, dynamically adapting both as more data becomes
available. In addition to its rigorous theoretical foundation and convergence
guarantees, we find empirically that the algorithm is highly practical and
robust.
Comment: First two authors contributed equally. In Proceedings of the 43rd
International ACM SIGIR Conference on Research and Development in Information
Retrieval, 2020.
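The controller described above can be made concrete: at each step, re-rank by estimated relevance plus an error term proportional to how far each item's group lags behind in exposure per unit of merit. The sketch below illustrates that proportional-control pattern; the function name, the exposure/merit bookkeeping, and the gain parameter lam are illustrative assumptions, not the paper's exact algorithm.

import numpy as np

def fairness_controlled_ranking(rel_est, groups, exposure, merit, lam=0.01):
    # Rank items by estimated relevance plus a proportional error term that
    # boosts items whose group lags the best group in exposure per unit merit.
    # rel_est:  np.ndarray of estimated relevances, shape (n,)
    # groups:   group id for each of the n items
    # exposure: dict group -> cumulative exposure received so far
    # merit:    dict group -> cumulative merit (e.g. summed relevance)
    # lam:      controller gain; larger values correct disparities faster
    ratios = {g: exposure[g] / max(merit[g], 1e-9) for g in exposure}
    best = max(ratios.values())
    # err >= 0, and grows with how under-exposed the item's group is
    err = np.array([best - ratios[g] for g in groups])
    return np.argsort(-(rel_est + lam * err))  # item indices, best score first

After each ranking is shown, exposure and merit would be updated from the position-based exposure and estimated relevance of the displayed items, closing the control loop.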
Measuring Fairness in Ranked Results: An Analytical and Empirical Comparison
Information access systems, such as search and recommender systems, often use ranked lists to present results believed to be relevant to the user's information need. Evaluating these lists for their fairness, along with other traditional metrics, provides a more complete understanding of an information access system's behavior beyond accuracy or utility constructs. To measure the (un)fairness of rankings, particularly with respect to the protected group(s) of producers or providers, several metrics have been proposed in recent years. However, an empirical and comparative analysis of these metrics, showing their applicability to specific scenarios or real data, their conceptual similarities, and their differences, is still lacking.
We aim to bridge the gap between the theoretical and practical application of these metrics. In this paper we describe several fair ranking metrics from the existing literature in a common notation, enabling direct comparison of their approaches and assumptions, and we empirically compare them on the same experimental setup and data sets in the context of three information access tasks. We also provide a sensitivity analysis to assess the impact of the design choices and parameter settings that go into these metrics, and we point to additional work needed to improve fairness measurement.
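As a small illustration of the kind of measure such a comparison puts into common notation, the sketch below scores one ranking's group exposure under a geometric browsing model and reports its L1 gap from a target distribution. This is a generic exposure-disparity measure with an assumed patience parameter, not any specific metric from the paper.

def group_exposure(ranking_groups, patience=0.5):
    # Exposure share each group receives in one ranking under a geometric
    # browsing model: rank k (0-based) is examined with probability patience**k.
    exposure = {}
    for k, g in enumerate(ranking_groups):
        exposure[g] = exposure.get(g, 0.0) + patience ** k
    total = sum(exposure.values())
    return {g: e / total for g, e in exposure.items()}

def exposure_disparity(ranking_groups, target):
    # L1 distance between realized group exposure shares and a target
    # distribution (e.g. uniform, or each group's share of the catalog).
    realized = group_exposure(ranking_groups)
    keys = set(realized) | set(target)
    return sum(abs(realized.get(g, 0.0) - target.get(g, 0.0)) for g in keys)

# Example: a ranking whose top slots all go to group "A".
print(exposure_disparity(["A", "A", "A", "B", "B"], {"A": 0.5, "B": 0.5}))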
Matched Pair Calibration for Ranking Fairness
We propose a test of fairness in score-based ranking systems called matched
pair calibration. Our approach constructs a set of matched item pairs with
minimal confounding differences between subgroups before computing an
appropriate measure of ranking error over the set. The matching step ensures
that we compare subgroup outcomes between identically scored items so that
measured performance differences directly imply unfairness in subgroup-level
exposures. We show how our approach generalizes the fairness intuitions of
calibration from a binary classification setting to ranking and connect our
approach to other proposals for ranking fairness measures. Moreover, our
strategy shows how the logic of marginal outcome tests extends to cases where
the analyst has access to model scores. Lastly, we provide an example of
applying matched pair calibration to a real-world ranking data set to
demonstrate its efficacy in detecting ranking bias.
Comment: 19 pages, 8 figures
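The matching step described above can be sketched in a few lines: pair items from two subgroups whose model scores are near-identical, then compare outcomes within pairs; a systematic nonzero gap suggests the score is miscalibrated across subgroups. Everything below, including the two-pointer matching and the tolerance tol, is an illustrative assumption, not the paper's construction.

import numpy as np

def matched_pair_calibration(scores, outcomes, groups, tol=1e-3):
    # Split items into the two subgroups, sorted by model score.
    a = sorted((s, y) for s, y, g in zip(scores, outcomes, groups) if g == 0)
    b = sorted((s, y) for s, y, g in zip(scores, outcomes, groups) if g == 1)
    if not a or not b:
        return float("nan")
    gaps, j = [], 0
    for s, y in a:
        # Two-pointer scan: advance to the closest-scored item in group 1
        # (matching is with replacement, a simplification for the sketch).
        while j + 1 < len(b) and abs(b[j + 1][0] - s) < abs(b[j][0] - s):
            j += 1
        if abs(b[j][0] - s) <= tol:  # keep only tightly matched pairs
            gaps.append(y - b[j][1])
    return float(np.mean(gaps)) if gaps else float("nan")

Restricting to tightly matched pairs mirrors the calibration intuition: items the model scores identically should realize the same outcomes regardless of subgroup.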