11 research outputs found

    Gender and Interest Targeting for Sponsored Post Advertising at Tumblr

    As one of the leading platforms for creative content, Tumblr offers advertisers a unique way of creating brand identity. Advertisers can tell their story through images, animation, text, music, video, and more, and promote that content by sponsoring it to appear as an advertisement in the streams of Tumblr users. In this paper we present a framework that enabled one of the key targeted advertising components for Tumblr, specifically gender and interest targeting. We describe the main challenges involved in the development of the framework, which include creating the ground truth for training gender prediction models, as well as mapping Tumblr content to an interest taxonomy. For purposes of inferring user interests we propose a novel semi-supervised neural language model for categorization of Tumblr content (i.e., post tags and post keywords). The model was trained on a large-scale data set consisting of 6.8 billion user posts, with a very limited amount of categorized keywords, and was shown to have superior performance over the bag-of-words model. We successfully deployed gender and interest targeting capability in Yahoo production systems, delivering inference for users who cover more than 90% of daily activities at Tumblr. Online performance results indicate advantages of the proposed approach, where we observed a 20% lift in user engagement with sponsored posts as compared to untargeted campaigns. Comment: 10 pages, 9 figures, Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2015), Sydney, Australia

    Search Retargeting using Directed Query Embeddings

    Determining the user audience for online ad campaigns is a critical problem for companies competing in the online advertising space. One of the most popular strategies is search retargeting, which involves targeting users who issued search queries related to the advertiser's core business, commonly specified by advertisers themselves. However, advertisers often fail to include many relevant queries, which results in suboptimal campaigns and negatively impacts revenue for both advertisers and publishers. To address this issue, we use recently proposed neural language models to learn low-dimensional, distributed query embeddings, which can be used to expand query lists with related queries through simple nearest neighbor searches in the embedding space. Experiments on a real-world data set strongly suggest benefits of the approach.
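
The query-expansion step described in this abstract can be illustrated with a minimal sketch. It assumes the embeddings have already been learned; the toy three-dimensional vectors, the query strings, and the helper names `cosine` and `expand_queries` are hypothetical and are not taken from the paper.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def expand_queries(seed_queries, embeddings, k=2):
    """Append the k nearest not-yet-listed queries for each seed query."""
    expanded = list(seed_queries)
    for seed in seed_queries:
        # Only consider queries that are not already in the list.
        candidates = [q for q in embeddings if q not in expanded]
        # Rank candidates by similarity to the seed, most similar first.
        candidates.sort(
            key=lambda q: cosine(embeddings[seed], embeddings[q]),
            reverse=True,
        )
        expanded.extend(candidates[:k])
    return expanded

# Hypothetical toy embeddings; real ones would come from a trained model.
TOY_EMBEDDINGS = {
    "running shoes": [0.9, 0.1, 0.0],
    "trail sneakers": [0.8, 0.2, 0.1],
    "marathon training": [0.7, 0.3, 0.0],
    "pizza delivery": [0.0, 0.1, 0.9],
}

expanded = expand_queries(["running shoes"], TOY_EMBEDDINGS, k=2)
print(expanded)
```

A brute-force scan like this is only practical for small vocabularies; at production scale an approximate nearest neighbor index would likely be needed.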

    Hidden Conditional Random Fields with Deep User Embeddings for Ad Targeting

    Estimating a user’s propensity to click on a display ad or purchase a particular item is a critical task in targeted advertising, a burgeoning online industry worth billions of dollars. Better and more accurate estimation methods result in improved online experience for users, as only relevant and interesting ads are shown, and may also lead to large benefits for advertisers, as targeted users are more likely to click or make a purchase. In this paper we address this important problem, and propose an approach for improved estimation of ad click or conversion probability based on a sequence of a user’s online actions, modeled using a state-of-the-art Hidden Conditional Random Fields (HCRF) model. In addition, in order to address the sparsity issue at the input side of the HCRF model, we propose to learn a distributed, low-dimensional representation of user actions through a directed skip-gram model, a novel deep architecture suitable for sequential data. The experimental results on a real-world data set comprising thousands of online user sessions collected at servers of a large internet company clearly indicate the benefits and the potential of the proposed approach, which outperformed the competing state-of-the-art algorithms and obtained significant improvements in terms of retrieval measures.

    Non-Linear Label Ranking for Large-Scale Prediction of Long-Term User Interests

    We consider the problem of personalization of online services from the viewpoint of ad targeting, where we seek to find the best ad categories to be shown to each user, resulting in improved user experience and increased advertiser revenue. We propose to address this problem as a task of ranking the ad categories depending on a user's preference, and introduce a novel label ranking approach capable of efficiently learning non-linear, highly accurate models in large-scale settings. Experiments on a real-world advertising data set with more than 3.2 million users show that the proposed algorithm outperforms the existing solutions in terms of both rank loss and top-K retrieval performance, strongly suggesting the benefit of using the proposed model on large-scale ranking problems.

    Systems and methods for query rewriting

    Systems and methods for rewriting query terms are disclosed. The system collects queries and query session data and separates the queries into sequences of queries having common sessions. The sequences of queries are then input into a deep learning network to build a multidimensional word vector in which related terms are nearer one another than unrelated terms. An input query is then received and the system matches the input query in the multidimensional word vector and rewrites the query using the nearest neighbors to the term of the input query.
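
The first step in this abstract, separating logged queries into per-session sequences, might look like the sketch below. The `(session_id, timestamp, query)` log format, the sample rows, and the function name `build_session_sequences` are assumptions made for illustration; the abstract does not specify a data layout.

```python
from collections import defaultdict

def build_session_sequences(query_log):
    """Group (session_id, timestamp, query) rows into ordered query sequences."""
    by_session = defaultdict(list)
    for session_id, timestamp, query in query_log:
        by_session[session_id].append((timestamp, query))
    # Sort each session's events by timestamp, then keep only the query text.
    return {
        session_id: [query for _, query in sorted(events)]
        for session_id, events in by_session.items()
    }

# Hypothetical log rows, deliberately out of order.
QUERY_LOG = [
    ("s1", 3, "cheap flights paris"),
    ("s2", 1, "laptop deals"),
    ("s1", 1, "flights"),
    ("s1", 2, "flights to paris"),
    ("s2", 2, "laptop deals under 500"),
]

sequences = build_session_sequences(QUERY_LOG)
print(sequences["s1"])
```

Each resulting sequence can then play the role of a "sentence" when training an embedding model, so that queries issued in the same session end up near one another.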