Search CORE

76,327 research outputs found

Incorporating Clicks, Attention and Satisfaction into a Search Engine Result Page Evaluation Model

Author: Chuklin Aleksandr
de Rijke Maarten
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2016
Field of study

Modern search engine result pages often provide immediate value to users and organize information in such a way that it is easy to navigate. The core ranking function contributes to this and so do result snippets, smart organization of result blocks and extensive use of one-box answers or side panels. While they are useful to the user and help search engines to stand out, such features present two big challenges for evaluation. First, the presence of such elements on a search engine result page (SERP) may lead to the absence of clicks, which is, however, not related to dissatisfaction, so-called "good abandonments." Second, the non-linear layout and visual difference of SERP items may lead to non-trivial patterns of user attention, which is not captured by existing evaluation metrics. In this paper we propose a model of user behavior on a SERP that jointly captures click behavior, user attention and satisfaction, the CAS model, and demonstrate that it gives more accurate predictions of user actions and self-reported satisfaction than existing models based on clicks alone. We use the CAS model to build a novel evaluation metric that can be applied to non-linear SERP layouts and that can account for the utility that users obtain directly on a SERP. We demonstrate that this metric shows better agreement with user-reported satisfaction than conventional evaluation metrics.Comment: CIKM2016, Proceedings of the 25th ACM International Conference on Information and Knowledge Management. 201

arXiv.org e-Print Archive

Crossref

UvA-DARE

International Migration, Integration and Social Cohesion online publications

Automatic Music Playlist Generation via Simulation-based Reinforcement Learning

Author: Cauteruccio Joseph
Ciosek Kamil
Dai Zhenwen
Kanoria Surya
Rinaldi Matteo
Tomasi Federico
Publication venue
Publication date: 13/10/2023
Field of study

Personalization of playlists is a common feature in music streaming services, but conventional techniques, such as collaborative filtering, rely on explicit assumptions regarding content quality to learn how to make recommendations. Such assumptions often result in misalignment between offline model objectives and online user satisfaction metrics. In this paper, we present a reinforcement learning framework that solves for such limitations by directly optimizing for user satisfaction metrics via the use of a simulated playlist-generation environment. Using this simulator we develop and train a modified Deep Q-Network, the action head DQN (AH-DQN), in a manner that addresses the challenges imposed by the large state and action space of our RL formulation. The resulting policy is capable of making recommendations from large and dynamic sets of candidate items with the expectation of maximizing consumption metrics. We analyze and evaluate agents offline via simulations that use environment models trained on both public and proprietary streaming datasets. We show how these agents lead to better user-satisfaction metrics compared to baseline methods during online A/B tests. Finally, we demonstrate that performance assessments produced from our simulator are strongly correlated with observed online metric results.Comment: 10 pages. KDD 2

arXiv.org e-Print Archive

Auditing Search Engines for Differential Satisfaction Across Demographics

Author: Anderson Ashton
Diaz Fernando
Mehrotra Rishabh
Sharma Amit
Wallach Hanna
Yilmaz Emine
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 24/05/2017
Field of study

Many online services, such as search engines, social media platforms, and digital marketplaces, are advertised as being available to any user, regardless of their age, gender, or other demographic factors. However, there are growing concerns that these services may systematically underserve some groups of users. In this paper, we present a framework for internally auditing such services for differences in user satisfaction across demographic groups, using search engines as a case study. We first explain the pitfalls of na\"ively comparing the behavioral metrics that are commonly used to evaluate search engines. We then propose three methods for measuring latent differences in user satisfaction from observed differences in evaluation metrics. To develop these methods, we drew on ideas from the causal inference literature and the multilevel modeling literature. Our framework is broadly applicable to other online services, and provides general insight into interpreting their evaluation metrics.Comment: 8 pages Accepted at WWW 201

arXiv.org e-Print Archive

UCL Discovery

Beyond Cumulated Gain and Average Precision: Including Willingness and Expectation in the User Model

Author: Dupret Georges
Lalmas Mounia
Piwowarski Benjamin
Publication venue
Publication date: 01/01/2012
Field of study

In this paper, we define a new metric family based on two concepts: The definition of the stopping criterion and the notion of satisfaction, where the former depends on the willingness and expectation of a user exploring search results. Both concepts have been discussed so far in the IR literature, but we argue in this paper that defining a proper single valued metric depends on merging them into a single conceptual framework

arXiv.org e-Print Archive

HAL Descartes

Hal-Diderot

A Framework proposal for monitoring and evaluating training in ERP implementation project

Author: Casanovas Garcia Josep
Esteves José
Pastor Collado Juan Antonio
Publication venue
Publication date: 01/01/2003
Field of study

During the last years some researchers have studied the topic of critical success factors in ERP implementations, out of which 'training' is cited as one of the most ones. Up to this moment, there is not enough research on the management and operationalization of critical success factors within ERP implementation projects.Postprint (published version

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Survey on Evaluation Methods for Dialogue Systems

Author: Agirre Eneko
Cieliebak Mark
Deriu Jan
Echegoyen Guillermo
Otegi Arantxa
Rodrigo Alvaro
Rosset Sophie
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2020
Field of study

In this paper we survey the methods and concepts developed for the evaluation of dialogue systems. Evaluation is a crucial part during the development process. Often, dialogue systems are evaluated by means of human evaluations and questionnaires. However, this tends to be very cost and time intensive. Thus, much work has been put into finding methods, which allow to reduce the involvement of human labour. In this survey, we present the main concepts and methods. For this, we differentiate between the various classes of dialogue systems (task-oriented dialogue systems, conversational dialogue systems, and question-answering dialogue systems). We cover each class by introducing the main technologies developed for the dialogue systems and then by presenting the evaluation methods regarding this class

arXiv.org e-Print Archive

ZHAW digitalcollection