1,120 research outputs found
Context Models For Web Search Personalization
We present our solution to the Yandex Personalized Web Search Challenge. The
aim of this challenge was to use the historical search logs to personalize
top-N document rankings for a set of test users. We used over 100 features
extracted from user- and query-dependent contexts to train neural net and
tree-based learning-to-rank and regression models. Our final submission, which
was a blend of several different models, achieved an NDCG@10 of 0.80476 and
placed 4th among the 194 teams, winning 3rd prize.
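For readers unfamiliar with the metric quoted above, the following is a minimal sketch of NDCG@10. It assumes the common exponential gain (2^rel - 1) with a log2 position discount; the challenge organizers may have used a different variant, and the function names are illustrative only.

```python
import math

def dcg_at_k(relevances, k=10):
    # Discounted cumulative gain over the top-k positions.
    # Assumed convention: gain = 2^rel - 1, discount = log2(position + 1).
    return sum((2 ** rel - 1) / math.log2(i + 2)
               for i, rel in enumerate(relevances[:k]))

def ndcg_at_k(relevances, k=10):
    # Normalize by the DCG of the ideal (descending) ordering of the same labels.
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0
```

Here `relevances` lists the graded relevance labels of the returned documents in ranked order; a ranking that is already sorted by relevance scores 1.0.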
Contextualised Browsing in a Digital Library's Living Lab
Contextualisation has proven to be effective in tailoring search
results towards the users' information need. While this is true for a basic
query search, the usage of contextual session information during exploratory
search especially on the level of browsing has so far been underexposed in
research. In this paper, we present two approaches that contextualise browsing
on the level of structured metadata in a Digital Library (DL): (1) one variant
is based on document similarity and (2) one variant utilises implicit session
information, such as queries and different document metadata encountered during
the user's session. We evaluate our approaches in a living lab environment
using a DL in the social sciences and compare our contextualisation approaches
against a non-contextualised approach. For a period of more than three months
we analysed 47,444 unique retrieval sessions that contain search activities on
the level of browsing. Our results show that a contextualisation of browsing
significantly outperforms our baseline in terms of the position of the first
clicked item in the result set. The mean rank of the first clicked document
(measured as mean first relevant - MFR) was 4.52 using a non-contextualised
ranking compared to 3.04 when re-ranking the result lists based on similarity
to the previously viewed document. Furthermore, we observed that both
contextual approaches show a noticeably higher click-through rate. A
contextualisation based on document similarity leads to almost twice as many
document views compared to the non-contextualised ranking. Comment: 10 pages, 2 figures, paper accepted at JCDL 201
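The mean first relevant (MFR) figure reported above can be illustrated with a minimal sketch. It assumes each session is summarized by the 1-based rank of its first clicked item, with `None` for sessions without a click; the function name and the handling of click-free sessions are assumptions, not details taken from the paper.

```python
def mean_first_relevant(first_click_ranks):
    # Average the 1-based rank of the first clicked item per session.
    # Sessions without any click (None) are skipped, which is an assumption.
    ranks = [r for r in first_click_ranks if r is not None]
    if not ranks:
        raise ValueError("no session contains a click")
    return sum(ranks) / len(ranks)
```

Lower values are better: a smaller MFR means users found a document worth clicking closer to the top of the list.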
Personalized Ranking in eCommerce Search
We address the problem of personalization in the context of eCommerce search.
Specifically, we develop personalization ranking features that use in-session
context to augment a generic ranker optimized for conversion and relevance. We
use a combination of latent features learned from item co-clicks in historic
sessions and content-based features that use item title and price.
Personalization in search has been discussed extensively in the existing
literature. The novelty of our work is combining and comparing content-based
and content-agnostic features and showing that they complement each other to
result in a significant improvement of the ranker. Moreover, our technique does
not require an explicit re-ranking step, does not rely on learning user
profiles from long term search behavior, and does not involve complex modeling
of query-item-user features. Our approach captures item co-click propensity
using lightweight item embeddings. We experimentally show that our technique
significantly outperforms a generic ranker in terms of Mean Reciprocal Rank
(MRR). We also provide anecdotal evidence for the semantic similarity captured
by the item embeddings on the eBay search engine. Comment: Under Revie
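As a reminder of how Mean Reciprocal Rank is defined, here is a minimal sketch. It assumes each query is summarized by the 1-based rank of its first relevant (for example, clicked or purchased) item, with `None` when no relevant item was returned; the function name is illustrative.

```python
def mean_reciprocal_rank(first_relevant_ranks):
    # Each entry is the 1-based rank of the first relevant item for a query,
    # or None when nothing relevant was returned (contributes 0).
    if not first_relevant_ranks:
        raise ValueError("need at least one query")
    total = sum(1.0 / r if r is not None else 0.0 for r in first_relevant_ranks)
    return total / len(first_relevant_ranks)
```

Higher values are better, and a ranker that always places a relevant item first scores 1.0.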
Predicting Session Length in Media Streaming
Session length is a very important aspect in determining a user's
satisfaction with a media streaming service. Being able to predict how long a
session will last can be of great use for various downstream tasks, such as
recommendations and ad scheduling. Most of the related literature on user
interaction duration has focused on dwell time for websites, usually in the
context of approximating post-click satisfaction either in search results, or
display ads. In this work we present the first analysis of session length in a
mobile-focused online service, using a real world data-set from a major music
streaming service. We use survival analysis techniques to show that the
characteristics of the length distributions can differ significantly between
users, and use gradient boosted trees with appropriate objectives to predict
the length of a session using only information available at its beginning. Our
evaluation on real world data illustrates that our proposed technique
outperforms the considered baseline. Comment: 4 pages, 3 figure
Why People Search for Images using Web Search Engines
What are the intents or goals behind human interactions with image search
engines? Knowing why people search for images is of major concern to Web image
search engines because user satisfaction may vary as intent varies. Previous
analyses of image search behavior have mostly been query-based, focusing on
what images people search for, rather than intent-based, that is, why people
search for images. To date, there is no thorough investigation of how different
image search intents affect users' search behavior.
In this paper, we address the following questions: (1) Why do people search
for images in text-based Web image search systems? (2) How does image search
behavior change with user intent? (3) Can we predict user intent effectively
from interactions during the early stages of a search session? To this end, we
conduct both a lab-based user study and a commercial search log analysis.
We show that user intents in image search can be grouped into three classes:
Explore/Learn, Entertain, and Locate/Acquire. Our lab-based user study reveals
different user behavior patterns under these three intents, such as first click
time, query reformulation, dwell time and mouse movement on the result page.
Based on user interaction features during the early stages of an image search
session, that is, before mouse scroll, we develop an intent classifier that is
able to achieve promising results for classifying intents into our three intent
classes. Given that all features can be obtained online and unobtrusively, the
predicted intents can provide guidance for choosing ranking methods immediately
after scrolling
Adapting Triplet Importance of Implicit Feedback for Personalized Recommendation
Implicit feedback is frequently used for developing personalized
recommendation services due to its ubiquity and accessibility in real-world
systems. In order to effectively utilize such information, most research adopts
the pairwise ranking method on constructed training triplets (user, positive
item, negative item) and aims to distinguish between positive items and
negative items for each user. However, most of these methods treat all the
training triplets equally, which ignores the subtle difference between
different positive or negative items. On the other hand, even though some other
works make use of the auxiliary information (e.g., dwell time) of user
behaviors to capture this subtle difference, such auxiliary information is hard
to obtain. To mitigate the aforementioned problems, we propose a novel training
framework named Triplet Importance Learning (TIL), which adaptively learns the
importance score of training triplets. We devise two strategies for the
importance score generation and formulate the whole procedure as a bilevel
optimization, which does not require any rule-based design. We integrate the
proposed training procedure with several Matrix Factorization (MF)- and Graph
Neural Network (GNN)-based recommendation models, demonstrating the
compatibility of our framework. Via a comparison using three real-world
datasets with many state-of-the-art methods, we show that our proposed method
outperforms the best existing models by 3-21% in terms of Recall@k for the
top-k recommendation
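The idea of weighting training triplets can be sketched with a pairwise ranking loss in which each (user, positive item, negative item) triplet carries its own importance score. This is only an illustration under stated assumptions: the loss form is the standard BPR-style loss, the importance weights are passed in as fixed numbers, and the bilevel optimization that TIL uses to learn them is not shown. All names are hypothetical.

```python
import math

def weighted_pairwise_loss(score_pairs, weights):
    # score_pairs: (positive_score, negative_score) per training triplet.
    # weights: importance score per triplet (given here, learned in TIL).
    # Per-triplet loss is -log(sigmoid(pos - neg)) = log(1 + exp(-(pos - neg))).
    if len(score_pairs) != len(weights):
        raise ValueError("one weight per triplet is required")
    total = 0.0
    for (pos, neg), w in zip(score_pairs, weights):
        total += w * math.log1p(math.exp(-(pos - neg)))
    return total / len(score_pairs)
```

Setting every weight to 1.0 recovers the usual setting in which all triplets are treated equally.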
Enhanced information retrieval using domain-specific recommender models
The objective of an information retrieval (IR) system is to retrieve relevant items that meet a user’s information need. There is currently significant interest in personalized IR, which seeks to improve IR effectiveness by incorporating a model of the user’s interests. However, in some situations
there may be no opportunity to learn about the interests of a specific user on a certain topic. In our work, we propose an IR approach which combines a recommender algorithm with IR methods to improve retrieval for domains where the system has no opportunity to learn prior information about the user’s knowledge of a domain for which they have not previously entered a query. We use search data from other previous users interested in the same topic to build a
recommender model for this topic. When a user enters a query on a topic new to this user, an appropriate recommender model is selected and used to predict a ranking which the user may find interesting based on the behaviour of previous
users with similar queries. The recommender output is integrated with a standard IR method in a weighted linear combination to provide a final result for the user. Experiments using the INEX 2009 data collection with a simulated recommender training set show that our approach can improve on a baseline IR system
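
The weighted linear combination described above could look roughly like the following sketch. It assumes both score sets are already normalized to a comparable range, that a document missing from one source scores 0 there, and that a single weight `alpha` balances the two; the function name, the default weight, and these conventions are assumptions, not details from the paper.

```python
def combine_scores(ir_scores, rec_scores, alpha=0.7):
    # ir_scores / rec_scores: dicts mapping document id -> normalized score.
    # alpha weights the IR score; (1 - alpha) weights the recommender score.
    docs = set(ir_scores) | set(rec_scores)
    combined = {
        d: alpha * ir_scores.get(d, 0.0) + (1 - alpha) * rec_scores.get(d, 0.0)
        for d in docs
    }
    # Return document ids ordered by combined score, best first.
    return sorted(combined, key=combined.get, reverse=True)
```

For example, `combine_scores({"a": 0.9, "b": 0.5}, {"b": 1.0, "c": 0.2}, alpha=0.5)` ranks `b` ahead of `a` because the recommender strongly favours `b`.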