303 research outputs found
Bayesian Ranker Comparison Based on Historical User Interactions
ABSTRACT We address the problem of how to safely compare rankers for information retrieval. In particular, we consider how to control the risks associated with switching from an existing production ranker to a new candidate ranker. Whereas existing online comparison methods require showing potentially suboptimal result lists to users during the comparison process, which can lead to user frustration and abandonment, our approach only requires interaction data generated through users' natural use of the production ranker. Specifically, we propose a Bayesian approach for (1) comparing the production ranker to candidate rankers and (2) estimating the confidence of this comparison. The comparison of rankers is performed using click model-based information retrieval metrics, while the confidence of the comparison is derived from Bayesian estimates of uncertainty in the underlying click model. These confidence estimates are then used to determine whether a risk-averse decision criterion for switching to the candidate ranker has been satisfied. Experimental results on several learning to rank datasets and on a click log show that the proposed approach outperforms an existing ranker comparison method that does not take uncertainty into account.
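The risk-averse switching rule can be sketched roughly as follows. The click counts, the Beta(1, 1) prior, the rank-discounted utility metric, and the 0.95 threshold are all illustrative assumptions for this sketch, not the paper's actual click model or criterion.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-document click and impression counts from production logs
# (illustrative numbers, not from the paper).
clicks = np.array([120, 60, 30])
views = np.array([1000, 1000, 1000])

def sample_attractiveness(n_samples):
    # Beta posteriors per document under a Beta(1, 1) prior on click probability.
    return rng.beta(1 + clicks, 1 + views - clicks, size=(n_samples, len(clicks)))

def expected_utility(order, attractiveness):
    # Rank-discounted expected clicks for a given document ordering.
    discounts = 1.0 / np.log2(np.arange(2, len(order) + 2))
    return attractiveness[:, order] @ discounts

samples = sample_attractiveness(10_000)
production = expected_utility([0, 1, 2], samples)   # current ordering
candidate = expected_utility([1, 0, 2], samples)    # candidate swaps the top two

# Risk-averse rule: switch only if the candidate wins with high posterior probability.
p_better = float(np.mean(candidate > production))
switch = p_better > 0.95
```

Because the comparison is run entirely over posterior samples fitted to logged interactions, no user is ever shown the candidate's result lists before the criterion is met.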
Search engines that learn from their users
More than half the world's population uses web search engines, issuing over half a billion queries every single day. For many people, web search engines such as Baidu, Bing, Google, and Yandex are among the first resources they turn to when a question arises. Moreover, for many, search engines have become the most trusted route to information, more so even than traditional media such as newspapers, news websites, or news channels on television. What web search engines present people with greatly influences what they believe to be true, and consequently it influences their thoughts, opinions, decisions, and the actions they take. With this in mind, two things are important from an information retrieval research perspective. First, it is important to understand how well search engines (rankers) perform; second, this knowledge should be used to improve them. This thesis is about these two topics: the evaluation of search engines and learning search engines.
In the first part of this thesis we investigate how user interactions with search engines can be used to evaluate those search engines. In particular, we introduce a new online evaluation paradigm called multileaving, which extends interleaving. With multileaving, many rankers can be compared at once by combining document lists from these rankers into a single result list and attributing user interactions with this list to the rankers. We then investigate the relation between A/B testing and interleaved comparison methods. Both studies lead to much higher sensitivity of the evaluation methods, meaning that fewer user interactions are required to arrive at reliable conclusions. This has the important implication that fewer users need to be exposed to results from possibly inferior search engines.
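The multileaving idea described above can be sketched as a team-draft variant: each ranker takes turns contributing its next unused document, and clicks are credited to whichever ranker placed the clicked document. The function names and the team-draft scheme here are an illustrative sketch, not the thesis's exact algorithm.

```python
import random

def team_draft_multileave(rankings, length, seed=0):
    # Combine several rankers' document lists into one result list, recording
    # which ranker ("team") contributed each slot.
    rng = random.Random(seed)
    result, teams = [], []
    while len(result) < length:
        added = False
        for r in rng.sample(range(len(rankings)), len(rankings)):
            if len(result) >= length:
                break
            doc = next((d for d in rankings[r] if d not in result), None)
            if doc is not None:
                result.append(doc)
                teams.append(r)
                added = True
        if not added:          # all rankers exhausted
            break
    return result, teams

def credit_clicks(teams, clicked_positions):
    # Attribute each clicked slot to the ranker that placed its document.
    credit = {}
    for pos in clicked_positions:
        credit[teams[pos]] = credit.get(teams[pos], 0) + 1
    return credit
```

A single impression of the combined list thus yields preference evidence about every participating ranker at once, which is where the gain in sensitivity over pairwise interleaving comes from.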
In the second part of this thesis we turn to online learning to rank, building on the evaluation methods introduced and extended in the first part. We learn the parameters of base rankers from user interactions, and we use the multileaving methods as feedback in our learning method, leading to much faster convergence than existing methods. Again, the important implication is that fewer users need to be exposed to possibly inferior search engines, because the rankers adapt more quickly to changes in user preferences. The last part of this thesis is of a different nature than the earlier two parts: as opposed to the earlier chapters, we no longer study algorithms. Progress in information retrieval research has always been driven by a combination of algorithms, shared resources, and evaluation.
In the last part we focus on the latter two: we introduce a new shared resource and a new evaluation paradigm. Firstly, we propose Lerot, an online evaluation framework that allows us to simulate users interacting with a search engine. Our implementation has been released as open source software and is currently being used by researchers around the world. Secondly, we introduce OpenSearch, a new evaluation paradigm involving real users of real search engines. We describe an implementation of this paradigm that has already been widely adopted by the research community through challenges at CLEF and TREC.
Efficient Exploration of Gradient Space for Online Learning to Rank
Online learning to rank (OL2R) optimizes the utility of returned search results based on implicit feedback gathered directly from users. To improve its estimates, an OL2R algorithm examines one or more exploratory gradient directions and updates the current ranker if a proposed one is preferred by users in an interleaved test. In this paper, we accelerate the online learning process through efficient exploration of the gradient space. Our algorithm, named Null Space Gradient Descent, reduces the exploration space to the null space of recent poorly performing gradients. This prevents the algorithm from repeatedly exploring directions that have been discouraged by the most recent interactions with users. To improve the sensitivity of the resulting interleaved test, we selectively construct candidate rankers to maximize the chance that they can be differentiated by candidate ranking documents in the current query, and we use historically difficult queries to identify the best ranker when a tie occurs in comparing the rankers. Extensive experimental comparisons with state-of-the-art OL2R algorithms on several public benchmarks confirm the effectiveness of the proposed algorithm, especially its fast learning convergence and promising ranking quality at an early stage.
Comment: To appear in SIGIR '18: The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval
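The core null-space idea can be sketched with an SVD: take the recently rejected gradients, compute an orthonormal basis of their null space, and project random exploration proposals onto it so every candidate direction is orthogonal to what users have already discouraged. This is a minimal sketch of the geometry only; the tolerance, sampling scheme, and function names are assumptions, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(1)

def null_space_directions(bad_gradients, n_dirs, dim):
    # Project random proposals onto the null space of recently rejected
    # gradients, so exploration avoids already-discouraged directions.
    G = np.asarray(bad_gradients)           # shape: (k, dim)
    _, s, vt = np.linalg.svd(G, full_matrices=True)
    rank = int(np.sum(s > 1e-10))
    basis = vt[rank:]                       # orthonormal basis of the null space
    proposals = rng.standard_normal((n_dirs, dim))
    projected = proposals @ basis.T @ basis # drop components along bad gradients
    norms = np.linalg.norm(projected, axis=1, keepdims=True)
    return projected / np.clip(norms, 1e-12, None)

bad = [[1.0, 0.0, 0.0, 0.0]]                # one discouraged gradient direction
dirs = null_space_directions(bad, n_dirs=3, dim=4)
# Every proposed unit direction is orthogonal to the discouraged gradient.
```

Restricting proposals to this subspace is what saves interactions: no user feedback is spent re-testing directions that recent interleaved comparisons have already voted down.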
Cognitive Personalized Search Integrating Large Language Models with an Efficient Memory Mechanism
Traditional search engines usually provide identical search results for all users, overlooking individual preferences. To counter this limitation, personalized search re-ranks results based on user preferences derived from query logs. Deep learning-based personalized search methods have shown promise, but they rely heavily on abundant training data, making them susceptible to data sparsity challenges. This paper proposes a Cognitive Personalized Search (CoPS) model, which integrates Large Language Models (LLMs) with a cognitive memory mechanism inspired by human cognition. CoPS employs LLMs to enhance user modeling and the user search experience. The cognitive memory mechanism comprises sensory memory for quick sensory responses, working memory for sophisticated cognitive responses, and long-term memory for storing historical interactions. CoPS handles new queries in three steps: identifying re-finding behaviors, constructing user profiles with relevant historical information, and ranking documents based on personalized query intent. Experiments show that CoPS outperforms baseline models in zero-shot scenarios.
Comment: Accepted by WWW 202
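The three-step query handling over the memory tiers can be sketched as a cascade. All names and the dict/list memory layout below are illustrative assumptions for the sketch, not CoPS's actual API or data structures.

```python
def handle_query(query, sensory, working, long_term, rank_fn):
    # 1. Re-finding: an exact hit in fast sensory memory is answered at once.
    if query in sensory:
        return sensory[query]
    # 2. Profile construction: gather relevant history from working and
    #    long-term memory to stand in for a user profile.
    related = working.get(query, []) + [h for h in long_term if query in h["query"]]
    # 3. Personalized ranking over candidate documents, given that history.
    results = rank_fn(query, related)
    sensory[query] = results   # cache so an immediate re-finding query is fast
    return results
```

The point of the tiering is cost: cheap lookups answer repeated queries immediately, and the expensive LLM-backed profiling and ranking step only runs for genuinely new intents.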
Learning Colour Representations of Search Queries
Image search engines rely on appropriately designed ranking features that capture various aspects of the content semantics as well as historic popularity. In this work, we consider the role of colour in this relevance matching process. Our work is motivated by the observation that a significant fraction of user queries have an inherent colour associated with them. While some queries contain explicit colour mentions (such as 'black car' and 'yellow daisies'), other queries have implicit notions of colour (such as 'sky' and 'grass'). Furthermore, grounding queries in colour is not a mapping to a single colour, but a distribution in colour space. For instance, a search for 'trees' tends to have a bimodal distribution around the colours green and brown. We leverage historical clickthrough data to produce a colour representation for search queries and propose a recurrent neural network architecture to encode unseen queries into colour space. We also show how this embedding can be learnt alongside a cross-modal relevance ranker from impression logs in which a subset of the result images were clicked. We demonstrate that the use of a query-image colour distance feature leads to an improvement in ranker performance as measured by users' preferences for clicked versus skipped images.
Comment: Accepted as a full paper at SIGIR 202
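A minimal sketch of the query-image colour distance feature described above: both the query's predicted colour representation and the image's palette are treated as distributions over colour bins, and their distance becomes one feature for the ranker. The 4-bin palette, the example numbers, and the histogram-intersection distance are assumptions for this sketch, not the paper's actual representation.

```python
import numpy as np

# Illustrative colour histograms over a tiny 4-bin palette
# (green, brown, blue, grey). A query like 'trees' gets a bimodal
# distribution over green and brown, as the abstract describes.
query_colour = np.array([0.55, 0.35, 0.05, 0.05])   # hypothetical model output
image_colour = np.array([0.60, 0.30, 0.05, 0.05])   # histogram of an image's pixels

def colour_distance(p, q):
    # Histogram intersection distance between two colour distributions:
    # 0 for identical histograms, approaching 1 for disjoint ones.
    return 1.0 - np.minimum(p, q).sum()

feature = colour_distance(query_colour, image_colour)
# A small distance means the image's palette matches the query's implied
# colours; the value is fed to the relevance ranker alongside other features.
```

Representing the query as a distribution rather than a single colour is what lets multimodal cases like 'trees' score well against both green and brown images.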