11 research outputs found
Gender and Interest Targeting for Sponsored Post Advertising at Tumblr
As one of the leading platforms for creative content, Tumblr offers
advertisers a unique way of creating brand identity. Advertisers can tell their
story through images, animation, text, music, video, and more, and promote that
content by sponsoring it to appear as an advertisement in the streams of Tumblr
users. In this paper we present a framework that enabled one of the key
targeted advertising components for Tumblr, specifically gender and interest
targeting. We describe the main challenges involved in development of the
framework, which include creating the ground truth for training gender
prediction models, as well as mapping Tumblr content to an interest taxonomy.
For purposes of inferring user interests we propose a novel semi-supervised
neural language model for categorization of Tumblr content (i.e., post tags and
post keywords). The model was trained on a large-scale data set consisting of
6.8 billion user posts, with very limited amount of categorized keywords, and
was shown to have superior performance over the bag-of-words model. We
successfully deployed gender and interest targeting capability in Yahoo
production systems, delivering inference for users that cover more than 90% of
daily activities at Tumblr. Online performance results indicate advantages of
the proposed approach, where we observed 20% lift in user engagement with
sponsored posts as compared to untargeted campaigns.Comment: 10 pages, 9 figures, Proceedings of the 21th ACM SIGKDD International
Conference on Knowledge Discovery and Data Mining (KDD 2015), Sydney,
Australi
Search Retargeting using Directed Query Embeddings
ABSTRACT Determining user audience for online ad campaigns is a critical problem to companies competing in online advertising space. One of the most popular strategies is search retargeting, which involves targeting users that issued search queries related to advertiser's core business, commonly specified by advertisers themselves. However, advertisers often fail to include many relevant queries, which results in suboptimal campaigns and negatively impacts revenue for both advertisers and publishers. To address this issue, we use recently proposed neural language models to learn low-dimensional, distributed query embeddings, which can be used to expand query lists with related queries through simple nearest neighbor searches in the embedding space. Experiments on realworld data set strongly suggest benefits of the approach
Hidden Conditional Random Fields with Deep User Embeddings for Ad Targeting
Abstract—Estimating user’s propensity to click on a display ad or purchase a particular item is a critical task in targeted advertising, a burgeoning online industry worth billions of dollars. Better and more accurate estimation methods result in improved online experience for users, as only relevant and interesting ads are shown, and may also lead to large benefits for advertisers, as targeted users are more likely to click or make a purchase. In this paper we address this important problem, and propose an approach for improved estimation of ad click or conversion probability based on a sequence of user’s online actions, modeled using state-of-the-art Hidden Conditional Random Fields (HCRF) model. In addition, in order to address the sparsity issue at the input side of the HCRF model, we propose to learn a distributed, low-dimensional representation of user actions through a directed skip-gram model, a novel deep architecture suitable for sequential data. The experimental results on a real-world data set comprising thousands of online user sessions collected at servers of a large internet company clearly indicate the benefits and the potential of the proposed approach, which outperformed the competing state-of-the-art algorithms and obtained significant improvements in terms of retrieval measures. I
Non-Linear Label Ranking for Large-Scale Prediction of Long-Term User Interests
We consider the problem of personalization of online services from the viewpoint of ad targeting, where we seek to find the best ad categories to be shown to each user, resulting in improved user experience and increased advertiser's revenue. We propose to address this problem as a task of ranking the ad categories depending on a user's preference, and introduce a novel label ranking approach capable of efficiently learning non-linear, highly accurate models in large-scale settings. Experiments on real-world advertising data set with more than 3.2 million users show that the proposed algorithm outperforms the existing solutions in terms of both rank loss and top-K retrieval performance, strongly suggesting the benefit of using the proposed model on large-scale ranking problems
Systems and methods for query rewriting
Systems and methods for rewriting query terms are disclosed. The system collects queries and query session data and separates the queries into sequences of queries having common sessions. The sequences of queries are then input into a deep learning network to build a multidimensional word vector in which related terms are nearer one another than unrelated terms. An input query is then received and the system matches the input query in the multidimensional word vector and rewrites the query using the nearest neighbors to the term of the input query