Ranking for Relevance and Display Preferences in Complex Presentation Layouts
Learning to Rank has traditionally considered settings where given the
relevance information of objects, the desired order in which to rank the
objects is clear. However, with today's large variety of users and layouts this
is not always the case. In this paper, we consider so-called complex ranking
settings where it is not clear what should be displayed, that is, what the
relevant items are, and how they should be displayed, that is, where the most
relevant items should be placed. These ranking settings are complex as they
involve both traditional ranking and inferring the best display order. Existing
learning to rank methods cannot handle such complex ranking settings as they
assume that the display order is known beforehand. To address this gap we
introduce a novel Deep Reinforcement Learning method that is capable of
learning complex rankings, both the layout and the best ranking given the
layout, from weak reward signals. Our proposed method does so by selecting
documents and positions sequentially, hence it ranks both the documents and
positions, which is why we call it the Double-Rank Model (DRM). Our experiments
show that DRM outperforms all existing methods in complex ranking settings;
it thus leads to substantial ranking improvements in cases where the display
order is not known a priori.
Cascading Hybrid Bandits: Online Learning to Rank for Relevance and Diversity
Relevance ranking and result diversification are two core areas in modern
recommender systems. Relevance ranking aims at building a ranked list sorted in
decreasing order of item relevance, while result diversification focuses on
generating a ranked list of items that covers a broad range of topics. In this
paper, we study an online learning setting that aims to recommend a ranked list
with items that maximizes the ranking utility, i.e., a list whose items are
relevant and whose topics are diverse. We formulate it as the cascade hybrid
bandits (CHB) problem. CHB assumes the cascading user behavior, where a user
browses the displayed list from top to bottom, clicks the first attractive
item, and stops browsing the rest. We propose a hybrid contextual bandit
approach, called CascadeHybrid, for solving this problem. CascadeHybrid models
item relevance and topical diversity using two independent functions and
simultaneously learns those functions from user click feedback. We conduct
experiments to evaluate CascadeHybrid on two real-world recommendation
datasets: MovieLens and Yahoo music datasets. Our experimental results show
that CascadeHybrid outperforms the baselines. In addition, we prove theoretical
guarantees on the n-step performance, demonstrating the soundness of
CascadeHybrid.
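
The cascading user behavior described in this abstract can be illustrated with a minimal sketch. It is not the CascadeHybrid algorithm itself; the function names are hypothetical, and the click-probability helper additionally assumes that items attract the user independently of each other.

```python
from typing import Optional, Sequence


def cascade_click(attractive: Sequence[bool]) -> Optional[int]:
    """Return the 0-based rank of the clicked item, or None if nothing is clicked.

    The user scans the list from top to bottom, clicks the first
    attractive item, and stops browsing the rest.
    """
    for rank, is_attractive in enumerate(attractive):
        if is_attractive:
            return rank
    return None


def cascade_click_probability(attraction_probs: Sequence[float]) -> float:
    """Probability that the list receives a click.

    Assumption (not stated in the abstract): each item is attractive
    independently with the given probability.
    """
    p_no_click = 1.0
    for p in attraction_probs:
        p_no_click *= 1.0 - p
    return 1.0 - p_no_click
```

Under this model, items below the clicked position are never examined, so a click says nothing about their relevance.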
Constructing an Interaction Behavior Model for Web Image Search
User interaction behavior is a valuable source of implicit relevance
feedback. In Web image search a different type of search result presentation is
used than in general Web search, which leads to different interaction
mechanisms and user behavior. For example, image search results are
self-contained, so that users do not need to click the results to view the
landing page as in general Web search, which generates sparse click data. Also,
two-dimensional result placement instead of a linear result list makes browsing
behaviors more complex. Thus, it is hard to apply standard user behavior models
(e.g., click models) developed for general Web search to Web image search.
In this paper, we conduct a comprehensive image search user behavior analysis
using data from a lab-based user study as well as data from a commercial search
log. We then propose a novel interaction behavior model, called grid-based user
browsing model (GUBM), whose design is motivated by observations from our data
analysis. GUBM can both capture users' interaction behavior, including cursor
hovering, and alleviate position bias. The advantages of GUBM are two-fold: (1)
It is based on an unsupervised learning method and does not need manually
annotated data for training. (2) It is based on user interaction features on
search engine result pages (SERPs) and is easily transferable to other
scenarios that have a grid-based interface such as video search engines. We
conduct extensive experiments to test the performance of our model using a
large-scale commercial image search log. Experimental results show that in
terms of behavior prediction (perplexity), and topical relevance and image
quality (normalized discounted cumulative gain (NDCG)), GUBM outperforms
state-of-the-art baseline models as well as the original ranking. We make the
implementation of GUBM and related datasets publicly available for future
studies.
Comment: 10 pages
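
The two evaluation measures named in this abstract, perplexity and NDCG, can be sketched as follows. This is a generic illustration rather than the paper's evaluation code: the exponential gain `2**rel - 1` and the log2 rank discount are one common NDCG variant, assumed here because the abstract does not specify which form was used, and the function names are hypothetical.

```python
import math
from typing import Optional, Sequence


def dcg(relevances: Sequence[float]) -> float:
    """Discounted cumulative gain with exponential gain and log2 discount."""
    return sum((2 ** rel - 1) / math.log2(rank + 2)
               for rank, rel in enumerate(relevances))


def ndcg(relevances: Sequence[float], k: Optional[int] = None) -> float:
    """NDCG of a ranking given as relevance labels in displayed order."""
    rels = list(relevances)
    cutoff = len(rels) if k is None else k
    ideal = dcg(sorted(rels, reverse=True)[:cutoff])
    if ideal == 0.0:
        return 0.0  # no relevant items: define NDCG as 0
    return dcg(rels[:cutoff]) / ideal


def click_perplexity(clicks: Sequence[int], predicted: Sequence[float]) -> float:
    """Perplexity of predicted click probabilities against observed clicks.

    Lower is better; 1.0 is a perfect prediction, 2.0 matches a coin flip.
    Predicted probabilities must lie strictly between 0 and 1.
    """
    log_likelihood = sum(
        c * math.log2(q) + (1 - c) * math.log2(1 - q)
        for c, q in zip(clicks, predicted)
    )
    return 2 ** (-log_likelihood / len(clicks))
```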
Learning from User Interactions with Rankings: A Unification of the Field
Ranking systems form the basis for online search engines and recommendation
services. They process large collections of items, for instance web pages or
e-commerce products, and present the user with a small ordered selection. The
goal of a ranking system is to help a user find the items they are looking for
with the least amount of effort. Thus the rankings they produce should place
the most relevant or preferred items at the top of the ranking. Learning to
rank is a field within machine learning that covers methods which optimize
ranking systems w.r.t. this goal. Traditional supervised learning to rank
methods use expert judgements to evaluate and learn; however, in many
situations such judgements are impossible or infeasible to obtain. As a
solution, methods have been introduced that perform learning to rank based on
user clicks instead. The difficulty with clicks is that they are affected not
only by user preferences, but also by what rankings were displayed.
Therefore, these methods have to avoid being biased by factors other than
user preference. This thesis concerns learning to rank methods based on user
clicks and specifically aims to unify the different families of these methods.
As a whole, the second part of this thesis proposes a framework that bridges
many gaps between areas of online, counterfactual, and supervised learning to
rank. It has taken approaches, previously considered independent, and unified
them into a single methodology for widely applicable and effective learning to
rank from user clicks.
Comment: PhD Thesis of Harrie Oosterhuis defended at the University of
Amsterdam on November 27th 202
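
The point that clicks depend on the displayed ranking as well as on user preference can be made concrete with a small sketch. It assumes a simple position-based model in which the chance of examining an item decays as 1 / (rank + 1); this model and the function name are illustrative assumptions, not the method proposed in the thesis.

```python
def expected_click_probability(relevance: float, rank: int) -> float:
    """Expected click probability of an item shown at a 0-based rank.

    Assumption: the user examines rank r with probability 1 / (r + 1)
    and clicks an examined item with probability equal to its relevance.
    """
    examination = 1.0 / (rank + 1)
    return examination * relevance


# A highly relevant item shown low in the list...
low_placed = expected_click_probability(relevance=0.8, rank=3)
# ...collects fewer expected clicks than a weaker item shown at the top.
top_placed = expected_click_probability(relevance=0.4, rank=0)
```

Here the raw click rates would order the two items the wrong way round, which is the kind of bias that learning from clicks has to correct for.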