Learning from User Interactions with Rankings: A Unification of the Field
Ranking systems form the basis for online search engines and recommendation
services. They process large collections of items, for instance web pages or
e-commerce products, and present the user with a small ordered selection. The
goal of a ranking system is to help a user find the items they are looking for
with the least amount of effort. Thus the rankings they produce should place
the most relevant or preferred items at the top of the ranking. Learning to
rank is a field within machine learning that covers methods which optimize
ranking systems w.r.t. this goal. Traditional supervised learning to rank
methods utilize expert judgements to evaluate and learn; however, in many
situations such judgements are impossible or infeasible to obtain. As a
solution, methods have been introduced that perform learning to rank based on
user clicks instead. The difficulty with clicks is that they are not only
affected by user preferences, but also by what rankings were displayed.
Therefore, these methods must avoid being biased by factors other than
user preference. This thesis concerns learning to rank methods based on user
clicks and specifically aims to unify the different families of these methods.
As a whole, the second part of this thesis proposes a framework that bridges
many gaps between areas of online, counterfactual, and supervised learning to
rank. It takes approaches previously considered independent and unifies
them into a single methodology for widely applicable and effective learning to
rank from user clicks.
Comment: PhD Thesis of Harrie Oosterhuis defended at the University of
Amsterdam on November 27th 202
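The counterfactual learning to rank family the abstract refers to is often built on inverse propensity scoring (IPS): each logged click is reweighted by the inverse of the estimated probability that the user examined the position where the item was shown, which removes the position bias introduced by the displayed ranking. The sketch below illustrates that general idea; the function name, the toy click log, and the assumed examination probabilities are all illustrative, not taken from the thesis itself.

```python
def ips_relevance_estimates(click_log, examination_probs):
    """Estimate per-item relevance from logged clicks via IPS.

    click_log: list of (item_id, displayed_position, clicked) tuples.
    examination_probs: assumed examination probability per position
        (the position bias of the display).
    """
    totals = {}
    for item_id, position, clicked in click_log:
        if clicked:
            # Weight each click by the inverse of its examination
            # probability, so clicks at rarely examined (lower) positions
            # count for more than clicks at the top.
            totals[item_id] = totals.get(item_id, 0.0) + 1.0 / examination_probs[position]
    return totals

# Toy log: item "a" was always shown at the top, item "b" further down.
log = [("a", 0, True), ("a", 0, True), ("b", 2, True), ("b", 2, False)]
bias = [1.0, 0.5, 0.25]  # hypothetical examination probability per position
print(ips_relevance_estimates(log, bias))  # → {'a': 2.0, 'b': 4.0}
```

In expectation these weighted click counts are unbiased estimates of relevance under the assumed position-bias model, which is why "b"'s single click at position 2 outweighs "a"'s two clicks at the top.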
Optimizing interactive systems with data-driven objectives
Building interactive systems requires substantial effort; understanding what users want and designing corresponding optimization objectives are among the critical components. The reliability of hand-crafted objectives strongly depends on the amount of domain knowledge incorporated in them. In the first part of this thesis, we explore how to optimize interactive systems without hand-crafting objectives in a more general setup. Our solution requires no domain knowledge and is thus applicable even when prior knowledge is absent. In the second part of the thesis, we apply the idea of data-driven objectives to two types of interactive systems: open-domain dialogue systems and task-oriented dialogue systems. Besides exploring promising usage scenarios of data-driven objectives, we also investigate, in the last part of this thesis, the limitations and potential problems of current deep reinforcement learning-based solutions for dialogue policy learning in task-oriented dialogue systems.
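One simple way to make the notion of a "data-driven objective" concrete (this is an illustrative sketch, not the thesis's actual method) is to score interaction states by how often they occur in interactions that ended successfully, and then use that score as a reward signal instead of a hand-crafted objective. All names and the toy trajectories below are hypothetical.

```python
from collections import Counter

def data_driven_reward(trajectories):
    """Derive a per-state reward from logged interactions.

    trajectories: list of (states, success_flag) pairs, where states is the
    sequence of interaction states visited and success_flag marks whether
    the interaction ended well for the user.
    """
    in_success = Counter()
    in_all = Counter()
    for states, success in trajectories:
        for s in states:
            in_all[s] += 1
            if success:
                in_success[s] += 1
    # Reward = empirical probability that a visit to this state belongs
    # to a successful interaction.
    return {s: in_success[s] / in_all[s] for s in in_all}

# Toy dialogue logs: one successful booking, one abandoned session.
trajs = [(["greet", "ask", "book"], True),
         (["greet", "ask", "quit"], False)]
print(data_driven_reward(trajs))
# → {'greet': 0.5, 'ask': 0.5, 'book': 1.0, 'quit': 0.0}
```

A reward derived this way requires no domain knowledge beyond the interaction logs themselves, which is the property the abstract highlights; a reinforcement learning policy could then be trained against it in place of a hand-crafted objective.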