8 research outputs found

    Managing Risk of Bidding in Display Advertising

    In this paper, we deal with the uncertainty of bidding for display advertising. Similar to financial market trading, real-time bidding (RTB) based display advertising employs an auction mechanism to automate impression-level media buying, and running a campaign is no different from an investment in acquiring new customers in return for additional converted sales. Thus, how to optimally bid on an ad impression to drive profit and return-on-investment becomes essential. However, the large randomness of user behaviors and the cost uncertainty caused by auction competition may result in significant risk in campaign performance estimation. In this paper, we explicitly model the uncertainty of user click-through rate estimation and auction competition to capture the risk. We borrow an idea from finance and derive the value at risk for each ad display opportunity. Our formulation results in two risk-aware bidding strategies that penalize risky ad impressions and focus more on the ones with higher expected return and lower risk. The empirical study on real-world data demonstrates the effectiveness of our proposed risk-aware bidding strategies, yielding profit gains of 15.4% in offline experiments and up to 17.5% in an online A/B test on a commercial RTB platform over the widely applied bidding strategies.
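The idea of penalizing risky impressions can be illustrated with a minimal sketch. It is not the paper's formulation: it assumes the click-through rate estimate is approximately Gaussian with a known mean and standard deviation, and it bids on a lower quantile of that estimate instead of on the mean. The function name `risk_aware_bid` and the parameters `click_value` and `confidence` are hypothetical.

```python
from statistics import NormalDist


def risk_aware_bid(ctr_mean, ctr_std, click_value, confidence=0.9):
    """Bid on a pessimistic (lower-quantile) CTR instead of the mean CTR.

    Assumption: the CTR estimate is roughly Gaussian, so the lower quantile
    is mean - z * std. This is an illustrative value-at-risk style penalty,
    not the strategy derived in the paper.
    """
    # z-score for the chosen confidence level (about 1.28 for 0.9)
    z = NormalDist().inv_cdf(confidence)
    # A CTR cannot be negative, so clip the pessimistic estimate at zero
    ctr_at_risk = max(0.0, ctr_mean - z * ctr_std)
    return click_value * ctr_at_risk


# Two impressions with the same expected CTR but different uncertainty
certain_bid = risk_aware_bid(ctr_mean=0.01, ctr_std=0.0, click_value=100.0)
risky_bid = risk_aware_bid(ctr_mean=0.01, ctr_std=0.002, click_value=100.0)
print(certain_bid, risky_bid)
```

Under these assumptions, the impression with the more uncertain CTR estimate receives the lower bid even though both have the same expected return.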

    Tackling Biased Baselines in the Risk-Sensitive Evaluation of Retrieval Systems

    The aim of optimising information retrieval (IR) systems using a risk-sensitive evaluation methodology is to minimise the risk of performing any particular topic less effectively than a given baseline system. Baseline systems in this context determine the reference effectiveness for topics, relative to which the effectiveness of a given IR system in minimising the risk will be measured. However, the comparative risk-sensitive evaluation of a set of diverse IR systems – as attempted by the TREC 2013 Web track – is challenging, as the different systems under evaluation may be based upon a variety of different (base) retrieval models, such as learning to rank or language models. Hence, a question arises about how to properly measure the risk exhibited by each system. In this paper, we argue that no model of information retrieval alone is representative enough in this respect to be a true reference for the models available in the current state-of-the-art, and demonstrate, using the TREC 2012 Web track data, that as the baseline system changes, the resulting risk-based ranking of the systems changes significantly. Instead of using a particular system’s effectiveness as the reference effectiveness for topics, we propose several remedies including the use of mean within-topic system effectiveness as a baseline, which is shown to enable unbiased measurements of the risk-sensitive effectiveness of IR systems.
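A small sketch may help make the mean within-topic baseline concrete. It assumes the URisk formulation commonly used for risk-sensitive evaluation (average per-topic gain over the baseline, with losses weighted by 1 + alpha); the abstract does not state which measure the authors use, and the function names and the toy scores below are hypothetical.

```python
def mean_within_topic_baseline(runs):
    """Per-topic mean effectiveness across all systems.

    runs: dict mapping a system name to its list of per-topic scores,
    with every list covering the same topics in the same order.
    """
    n_topics = len(next(iter(runs.values())))
    return [
        sum(scores[t] for scores in runs.values()) / len(runs)
        for t in range(n_topics)
    ]


def urisk(system_scores, baseline_scores, alpha=1.0):
    """URisk: mean gain over the baseline, with losses weighted by 1 + alpha."""
    deltas = [s - b for s, b in zip(system_scores, baseline_scores)]
    wins = sum(d for d in deltas if d > 0)
    losses = sum(-d for d in deltas if d < 0)
    return (wins - (1 + alpha) * losses) / len(deltas)


# Toy per-topic scores for two systems (made-up numbers for illustration)
runs = {"A": [0.5, 0.25], "B": [0.25, 0.75]}
baseline = mean_within_topic_baseline(runs)
for name, scores in runs.items():
    print(name, urisk(scores, baseline, alpha=1.0))
```

Because the baseline here is the average of all systems on each topic, no single system's retrieval model acts as the reference, which is the property the abstract argues for.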

    A contextual recurrent collaborative filtering framework for modelling sequences of venue checkins

    Context-Aware Venue Recommendation (CAVR) systems aim to effectively generate a ranked list of interesting venues users should visit based on their historical feedback (e.g. checkins) and context (e.g. the time of the day or the user’s current location). Such systems are increasingly deployed by Location-based Social Networks (LBSNs) such as Foursquare and Yelp to enhance the satisfaction of the users. Matrix Factorisation (MF) is a popular Collaborative Filtering (CF) technique that can suggest relevant venues to users based on an assumption that similar users are likely to visit similar venues. In recent years, deep neural networks have been successfully applied to recommendation systems. Indeed, various approaches have been previously proposed in the literature to enhance the effectiveness of MF-based approaches by exploiting Recurrent Neural Networks (RNN) models to capture the sequential properties of observed checkins. Moreover, recently, several RNN architectures have been proposed to incorporate contextual information associated with the users’ sequence of checkins (for instance, the time interval or the geographical distance between two successive checkins) to effectively capture such short-term preferences of users. In this work, we propose a Contextual Recurrent Collaborative Filtering framework (CRCF) that leverages the users’ preferred context and the contextual information associated with the users’ sequence of checkins in order to model the users’ short-term preferences for CAVR. In particular, the CRCF framework is built upon two state-of-the-art approaches: namely Deep Recurrent Collaborative Filtering framework (DRCF) and Contextual Attention Recurrent Architecture (CARA). Thorough experiments on three large checkin and rating datasets from commercial LBSNs demonstrate the effectiveness and robustness of our proposed CRCF framework by significantly outperforming various state-of-the-art matrix factorisation approaches. 
In particular, the CRCF framework significantly improves NDCG@10 by 5–20% over the state-of-the-art DRCF framework (Manotumruksa, Macdonald, and Ounis, 2017a) and the CARA architecture (Manotumruksa, Macdonald, and Ounis, 2018) across the three datasets. Furthermore, the CRCF framework is significantly less risky than both the DRCF framework and the CARA architecture across the three datasets.

    Mining document, concept, and term associations for effective biomedical retrieval - Introducing MeSH-enhanced retrieval models

    Manually assigned subject terms, such as Medical Subject Headings (MeSH) in the health domain, describe the concepts or topics of a document. Existing information retrieval models do not take full advantage of such information. In this paper, we propose two MeSH-enhanced (ME) retrieval models that integrate the concept layer (i.e. MeSH) into the language modeling framework to improve retrieval performance. The new models quantify associations between documents and their assigned concepts to construct conceptual representations for the documents, and mine associations between concepts and terms to construct generative concept models. The two ME models reconstruct two essential estimation processes of the relevance model (Lavrenko and Croft 2001) by incorporating the document-concept and the concept-term associations. More specifically, in Model 1, language models of the pseudo-feedback documents are enriched by their assigned concepts. In Model 2, concepts that are related to users’ queries are first identified, and then used to reweight the pseudo-feedback documents according to the document-concept associations. Experiments carried out on two standard test collections show that the ME models outperformed the query likelihood model, the relevance model (RM3), and an earlier ME model. A detailed case analysis provides insight into how and why the new models improve/worsen retrieval performance. Implications and limitations of the study are discussed. This study provides new ways to formally incorporate semantic annotations, such as subject terms, into retrieval models. The findings of this study suggest that integrating the concept layer into retrieval models can further improve the performance over the current state-of-the-art models.

    Probabilistic Modeling in Dynamic Information Retrieval

    Dynamic modeling is used to design systems that are adaptive to their changing environment and is currently poorly understood in information retrieval systems. Common elements in the information retrieval methodology, such as documents, relevance, users and tasks, are dynamic entities that may evolve over the course of several interactions, which is increasingly captured in search log datasets. Conventional frameworks and models in information retrieval treat these elements as static, or only consider local interactivity, without consideration for the optimisation of all potential interactions. Further to this, advances in information retrieval interfaces, contextual personalization and ad display demand models that can intelligently react to users over time. This thesis proposes a new area of information retrieval research called Dynamic Information Retrieval. The term dynamics is defined, along with what it means within the context of information retrieval. Three examples of current areas of research in information retrieval which can be described as dynamic are covered: multi-page search, online learning to rank and session search. A probabilistic model for dynamic information retrieval is introduced and analysed, and applied in practical algorithms throughout. This framework is based on the partially observable Markov decision process model, and solved using dynamic programming and the Bellman equation. Comparisons are made against well-established techniques that show improvements in ranking quality and, in particular, document diversification. The limitations of this approach are explored and appropriate approximation techniques are investigated, resulting in the development of an efficient multi-armed bandit based ranking algorithm. Finally, the extraction of dynamic behaviour from search logs is also demonstrated as an application, showing that dynamic information retrieval modeling is an effective and versatile tool in state-of-the-art information retrieval research.